Presentation published 9 years ago by user Александра Шелапутина
1 Unit 3 Text Processing and System Configuration tools
2 Week 7: Text Processing Tools
3 Tools for Extracting Text
File Contents: less and cat
File Excerpts: head and tail
Extract by Column or Field: cut
Extract by Keyword: grep
4 Viewing File Contents
cat: dump one or more files to STDOUT
  Multiple files are concatenated together
less: view file or STDIN one page at a time
  Useful commands while viewing:
    /text searches for text
    n/N jumps to next/previous match
    v opens the file in a text editor
  less is the pager used by man
5 Some useful options to use with cat
  -A: show all characters, including control and non-printing characters
  -s: squeeze multiple adjacent blank lines into a single blank line
  -b: number each non-blank line of output
Viewing File Excerpts
head: display the first 10 lines of a file
tail: display the last 10 lines of a file
  Use -n to change the number of lines displayed
  Use -f (tail only) to follow subsequent additions to the file
    Very useful for monitoring log files
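The excerpt options above can be tried on throwaway data. The following is a minimal sketch, assuming a POSIX-like shell with `mktemp` and `seq` available; the file names and the log path in the final comment are made up for illustration.

```shell
# Sample data in a throwaway directory (file names are hypothetical).
tmpdir=$(mktemp -d)
seq 1 20 > "$tmpdir/numbers.txt"
printf 'a\n\n\n\nb\n' > "$tmpdir/gaps.txt"

# head and tail print 10 lines by default; -n changes the count.
first_three=$(head -n 3 "$tmpdir/numbers.txt")
last_two=$(tail -n 2 "$tmpdir/numbers.txt")

# cat -s squeezes the three blank lines in gaps.txt into one.
squeezed=$(cat -s "$tmpdir/gaps.txt")

printf '%s\n' "$first_three" "$last_two" "$squeezed"

# tail -f keeps running until interrupted, so it appears only as a comment
# (the path is an example and may not exist on a given system):
#   tail -f /var/log/messages
rm -rf "$tmpdir"
```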
6 Extracting text by keyword - grep
Print lines of files or STDIN where a pattern is matched
  $ grep john /etc/passwd
  $ date --help | grep year
Use -i to search case-insensitively
Use -n to print line numbers of matches
Use -v to print lines not containing the pattern
Use -A x to include x lines after each match
Use -B x to include x lines before each match
Use -r to recursively search a directory
Use --color=auto to highlight the match in color
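To see how the grep options above change the output, here is a small sketch run against a five-line sample file. The file name and its contents are invented for the example; only options named on the slide are used.

```shell
# Five-line sample file (hypothetical data).
tmpdir=$(mktemp -d)
printf 'alpha\nJohn Smith\nbeta\njohn doe\ngamma\n' > "$tmpdir/people.txt"

# -i ignores case, so both "John Smith" and "john doe" match.
any_case=$(grep -i john "$tmpdir/people.txt")

# -n prefixes each match with its line number; only line 4 matches lowercase "john".
numbered=$(grep -n john "$tmpdir/people.txt")

# -v inverts the match: every line that does NOT contain the pattern.
inverted=$(grep -v -i john "$tmpdir/people.txt")

# -A 1 adds one line of trailing context after each match.
with_context=$(grep -A 1 John "$tmpdir/people.txt")

printf '%s\n' "$numbered"
rm -rf "$tmpdir"
```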
7 Extracting text by column or field - cut
Display specific columns of file or STDIN data
  $ cut -d: -f1 /etc/passwd
  $ grep root /etc/passwd | cut -d: -f7
Use -d to specify the column delimiter
Use -f to specify the column (field) to print
Use -c to cut by characters
  $ cut -c2-5 /usr/share/dict/words
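The cut commands above depend on the contents of the real /etc/passwd, so the sketch below runs the same options against a two-line sample in passwd format. The sample file and its entries are assumptions made for the example.

```shell
# Two-line sample in /etc/passwd format (hypothetical entries).
tmpdir=$(mktemp -d)
printf 'root:x:0:0:root:/root:/bin/bash\nalice:x:1000:1000:Alice:/home/alice:/bin/zsh\n' \
  > "$tmpdir/passwd.sample"

# Field 1 with ':' as the delimiter: the login names.
names=$(cut -d: -f1 "$tmpdir/passwd.sample")

# Filter with grep first, then take field 7: the login shell of root.
root_shell=$(grep root "$tmpdir/passwd.sample" | cut -d: -f7)

# -c selects by character position: characters 2 through 5.
chars=$(echo 'abcdefgh' | cut -c2-5)

printf '%s\n' "$names" "$root_shell" "$chars"
rm -rf "$tmpdir"
```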
8 Tools for Analyzing Text
Text Stats: wc
Sorting Text: sort
Comparing files: diff and patch
Spell check: aspell
9 Gathering Text Statistics - wc
Counts words, lines, bytes and characters
Can act upon a file or STDIN
  $ wc a.txt
Use -l for only line count
Use -w for only word count
Use -c for only byte count
Use -m for character count
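As a quick check of what each wc option counts, the sketch below measures a two-line ASCII sample. The file name is hypothetical. Input is redirected from the file so that wc prints only the number and not the file name.

```shell
# Two lines, five words, 24 bytes of plain ASCII (hypothetical sample).
tmpdir=$(mktemp -d)
printf 'one two three\nfour five\n' > "$tmpdir/a.txt"

lines=$(wc -l < "$tmpdir/a.txt")   # newline count
words=$(wc -w < "$tmpdir/a.txt")   # whitespace-separated words
bytes=$(wc -c < "$tmpdir/a.txt")   # bytes, including the newlines
chars=$(wc -m < "$tmpdir/a.txt")   # characters; equals bytes for pure ASCII

echo "lines=$lines words=$words bytes=$bytes chars=$chars"
rm -rf "$tmpdir"
```

For text with multi-byte characters in a UTF-8 locale, the -m count can be smaller than the -c count.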
10 Sorting Text
Sorts text to STDOUT; the original file is unchanged
  $ sort [options] file(s)
Common Options
  -r performs a reverse (descending) sort
  -n performs a numeric sort
  -f ignores (folds) case of characters in strings
  -u (unique) removes duplicate lines in output
  -t c uses c as a field separator
  -k x sorts by c-delimited field x
    Can be used multiple times
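The field options are easier to follow on concrete data. This sketch sorts a small name:score file, first as plain text and then numerically on the second field. The file name and its contents are made up for illustration.

```shell
# name:score records, with one duplicate line (hypothetical data).
tmpdir=$(mktemp -d)
printf 'carol:30\nalice:4\nbob:25\nalice:4\n' > "$tmpdir/scores.txt"

# Plain sort with -u: ordered by the whole line, duplicates removed.
by_name_unique=$(sort -u "$tmpdir/scores.txt")

# -t : sets the separator, -k 2 picks the score field, -n compares numerically.
by_score=$(sort -t : -k 2 -n "$tmpdir/scores.txt")

# Adding -r reverses the order: highest score first.
by_score_desc=$(sort -t : -k 2 -n -r "$tmpdir/scores.txt")

printf '%s\n' "$by_score_desc"
rm -rf "$tmpdir"
```

Without -n, the scores would be compared as strings, and "4" would sort after "30" and "25".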
11 Eliminating duplicate lines
sort -u: removes duplicate lines from input
uniq: removes duplicate adjacent lines from input
  Use -c to count number of occurrences
  Use with sort for best effect:
    $ sort userlist.txt | uniq -c
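Because uniq only collapses adjacent duplicates, unsorted input can pass through unchanged. The sketch below shows the difference on an invented userlist.txt. The awk step is an addition that is not on the slide; it only normalizes the leading padding that uniq -c prints before each count.

```shell
# "bob" appears three times, but never on adjacent lines (hypothetical data).
tmpdir=$(mktemp -d)
printf 'bob\nalice\nbob\ncarol\nbob\n' > "$tmpdir/userlist.txt"

# uniq alone removes nothing here, since no two neighbouring lines are equal.
adjacent_only=$(uniq "$tmpdir/userlist.txt")

# Sorting first makes duplicates adjacent, so uniq -c can count them.
# awk reprints "count name" with a single space between the two columns.
counts=$(sort "$tmpdir/userlist.txt" | uniq -c | awk '{print $1, $2}')

printf '%s\n' "$counts"
rm -rf "$tmpdir"
```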
12 Comparing files - diff
Compares two files for differences
  $ diff foo.conf-broken foo.conf-works
  5c5
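A header such as 5c5 in default diff output means that line 5 of the first file was changed into line 5 of the second. The sketch below reproduces that kind of output on two tiny files that differ on line 2. The file names are hypothetical.

```shell
# Two three-line files that differ only on line 2 (hypothetical data).
tmpdir=$(mktemp -d)
printf 'a\nb\nc\n' > "$tmpdir/old.txt"
printf 'a\nB\nc\n' > "$tmpdir/new.txt"

# diff exits with 0 when the files match and 1 when they differ,
# so the status is captured instead of being treated as a failure.
diff_status=0
diff_out=$(diff "$tmpdir/old.txt" "$tmpdir/new.txt") || diff_status=$?

# Prints the change header "2c2", then "< b", "---", "> B".
printf '%s\n' "$diff_out"
rm -rf "$tmpdir"
```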
13 Duplicating file changes - patch
diff output stored in a file is called a patchfile
  Use -u for unified diff, best in patchfiles
patch duplicates changes in other files (use with care)
  Use -b to automatically back up changed files
  $ diff -u foo.conf-broken foo.conf-works > foo.patch
  $ patch -b foo.conf-broken foo.patch
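The diff and patch steps above can be walked through end to end on scratch copies. This is a sketch under stated assumptions: the file contents are invented, the `patch` utility is installed, and the backup is named with the `.orig` suffix that patch uses by default.

```shell
# Scratch copies of the two config files (contents are hypothetical).
tmpdir=$(mktemp -d)
broken="$tmpdir/foo.conf-broken"
works="$tmpdir/foo.conf-works"
printf 'port=80\nhost=localhost\n' > "$broken"
printf 'port=8080\nhost=localhost\n' > "$works"

# diff exits with status 1 when the files differ, so tolerate that here.
diff -u "$broken" "$works" > "$tmpdir/foo.patch" || true

# Apply the patch to the broken file; -b keeps the original as a backup.
patch -b "$broken" "$tmpdir/foo.patch"

patched=$(cat "$broken")        # now matches foo.conf-works
backup=$(cat "$broken.orig")    # assumed default backup suffix
printf '%s\n' "$patched"
rm -rf "$tmpdir"
```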
14 Spell checking with aspell
Interactively spell-check files:
  $ aspell check letter.txt
Non-interactively list misspelled words in STDIN:
  $ aspell list