Day 18 Learnings: Text Processing and Manipulation in Linux

Today’s Linux session focused on two text-processing tools: awk and sed. Together they cover data extraction, formatting, and stream editing directly from the command line.


What I Learned Today

  • awk Command in Unix/Linux:
    A programming language for pattern scanning and processing, commonly used for data extraction and reporting.

  • sed Command in Linux/Unix:
    A stream editor for filtering and transforming text in files or input streams.


Steps I Followed

Using the awk Command

  1. Print Specific Columns:
    Displayed the first column of a whitespace-separated file (by default awk splits fields on runs of spaces and tabs):

     awk '{print $1}' file.txt
    
  2. Filter Rows by Condition:
    Printed rows where the second column equals a specific value:

     awk '$2 == "value" {print $0}' file.txt
    
  3. Sum Values in a Column:
    Calculated the total of the third column:

     awk '{sum += $3} END {print sum}' file.txt
    
  4. Customize Output Format:
    Printed each record in a labeled format with printf:

     awk '{printf "Name: %s, Age: %d\n", $1, $2}' file.txt
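
To make the four awk steps above concrete, here is a minimal sketch that runs them against a small sample file. The file contents (name, age, score) and the variable names are made up for illustration and are not from the session itself.

```shell
# Hypothetical sample data: name, age, score (whitespace-separated).
tmp=$(mktemp -d)
printf '%s\n' 'alice 30 10' 'bob 25 20' 'carol 30 30' > "$tmp/file.txt"

# 1. First column of every line.
names=$(awk '{print $1}' "$tmp/file.txt")

# 2. Rows whose second column equals 30 (with no action given, awk prints the whole line).
thirty=$(awk '$2 == 30' "$tmp/file.txt")

# 3. Sum of the third column; "+ 0" makes awk print 0 instead of an empty line for empty input.
total=$(awk '{sum += $3} END {print sum + 0}' "$tmp/file.txt")

# 4. One formatted line per record.
report=$(awk '{printf "Name: %s, Age: %d\n", $1, $2}' "$tmp/file.txt")

printf '%s\n' "$names" "$thirty" "$total" "$report"
rm -r "$tmp"
```

With this sample data the total is 60 and the filter keeps the alice and carol rows.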
    

Using the sed Command

  1. Replace Text:
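    Replaced every occurrence of a string on each line. The result goes to standard output; the file itself is not modified unless sed is run with -i: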

     sed 's/oldword/newword/g' file.txt
    
  2. Delete Specific Lines:
    Dropped lines that match a pattern from the output:

     sed '/unwantedword/d' file.txt
    
  3. Insert a Line:
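    Added a new line after the second line (this one-line form works with GNU sed; other sed implementations may require a line break after a\):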

     sed '2 a\This is a new line' file.txt
    
  4. Extract Lines by Range:
    Printed lines 5 to 10:

     sed -n '5,10p' file.txt
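
The sed steps above can be sketched the same way. The sample lines and variable names below are invented for illustration. The append command is written with a line break after a\, which is the spelling I would expect to work across sed implementations rather than only GNU sed.

```shell
# Hypothetical sample data, four lines.
tmp=$(mktemp -d)
printf '%s\n' 'one cat' 'two dog' 'three cat cat' 'four bird' > "$tmp/file.txt"

# 1. Substitute every match on every line (g = all matches, not only the first per line).
replaced=$(sed 's/cat/fox/g' "$tmp/file.txt")

# 2. Delete lines that match a pattern.
deleted=$(sed '/dog/d' "$tmp/file.txt")

# 3. Append a line after line 2 (portable form: a\ followed by a line break).
inserted=$(sed '2a\
This is a new line' "$tmp/file.txt")

# 4. Print only lines 2 through 3 (-n suppresses the default output).
ranged=$(sed -n '2,3p' "$tmp/file.txt")

# None of the commands above changed the file on disk.
unchanged=$(cat "$tmp/file.txt")

printf '%s\n' "$replaced" "$deleted" "$inserted" "$ranged"
rm -r "$tmp"
```

Because every command writes to standard output, the original file still holds its four lines at the end. Saving a result means redirecting to a new file or using -i, whose exact syntax differs between GNU and BSD sed.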
    

Problems I Encountered

  1. Handling Delimiters with awk:
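    By default awk splits fields on runs of spaces and tabs. A comma-separated line therefore ends up as a single field, and a tab-separated line is split incorrectly whenever a field contains a space.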

  2. Complex sed Patterns:
    Crafting precise regular expressions for certain replacements was challenging.
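
The delimiter problem is easy to reproduce. This is a small sketch with made-up file names and contents, using awk's built-in NF variable (the number of fields in the current line) to show how each line was split.

```shell
# Hypothetical one-line sample files.
tmp=$(mktemp -d)
printf '%s\n' 'alice,30,london' > "$tmp/people.csv"
printf 'alice smith\t30\n' > "$tmp/people.tsv"

# The CSV line contains no whitespace, so the whole line is a single field.
csv_first=$(awk '{print $1}' "$tmp/people.csv")   # alice,30,london
csv_count=$(awk '{print NF}' "$tmp/people.csv")   # 1

# The TSV line has two tab-separated fields, but the space inside the name adds a third.
tsv_first=$(awk '{print $1}' "$tmp/people.tsv")   # alice
tsv_count=$(awk '{print NF}' "$tmp/people.tsv")   # 3

printf '%s\n' "$csv_first" "$csv_count" "$tsv_first" "$tsv_count"
rm -r "$tmp"
```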


How I Solved These Problems

  1. Specifying Delimiters in awk:
    Used the -F option to set the field separator explicitly (a comma here; -F '\t' does the same for tab-separated files):

     awk -F ',' '{print $1}' file.csv
    
  2. Mastering sed Patterns:
    Studied regular expression syntax and used online regex testers to refine patterns.
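
Both fixes can be sketched on small inputs. The sample data and variable names are hypothetical, and the sed pattern is just one illustrative use of capture groups, not a pattern from the session. The -E option switches sed to extended regular expressions, so groups are written ( ) instead of \( \).

```shell
# Hypothetical one-line sample files.
tmp=$(mktemp -d)
printf '%s\n' 'alice,30,london' > "$tmp/people.csv"
printf 'alice smith\t30\n' > "$tmp/people.tsv"

# An explicit comma separator: the third field is now the city.
city=$(awk -F ',' '{print $3}' "$tmp/people.csv")    # london

# An explicit tab separator: the space inside the name no longer splits the field.
name=$(awk -F '\t' '{print $1}' "$tmp/people.tsv")   # alice smith

# Extended regex with two capture groups: swap the parts around "=".
swapped=$(printf 'user=alice\n' | sed -E 's/^([a-z]+)=([a-z]+)$/\2=\1/')   # alice=user

printf '%s\n' "$city" "$name" "$swapped"
rm -r "$tmp"
```

Splitting on a plain comma is enough for simple files, but it does not handle quoted CSV fields that themselves contain commas.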


Resources I Used

  • Linuxize: AWK Command Tutorial

  • Linux Handbook: Sed Command Examples

  • Stack Overflow for real-world problem-solving tips.


Conclusion

The awk and sed commands complement each other for text processing in Linux. awk is the better fit for analyzing and formatting field-based data, while sed is the better fit for line-oriented stream edits such as substitution, deletion, and insertion. Both are useful building blocks for more advanced shell scripting.

Excited to explore additional text processing tools in the next sessions!