Views: 1,899
Votes: 3
Tags: command-line, text-processing, grep
Link: See Original Answer on Ask Ubuntu
URL: https://askubuntu.com/q/1019411
Title: Extract one element from lines of a text file
ID: /2018/03/26/Extract-one-element-from-lines-of-a-text-file
Created: March 26, 2018
Edited: March 27, 2018
Upload: September 15, 2024
This is one of those questions where it is helpful to have a test input file and examples of the desired output.
Input File
Here is a test input file I copied from the Internet and modified to encase search words within ** pairs:
$ cat ~/Downloads/wordlist.txt
**Schadenfreude**
This is a German word, although used in English too, which is used to mean “malicious enjoyment of the misfortunes of others”. It comes from the joining of the words schaden meaning “harm” and freude meaning “joy”.
**Waldeinsamkeit**
Ever found yourself wandering alone through a forest and wanting to express the emotion brought about by that wander? Look no further! In German, Waldeinsamkeit means “woodland solitude”.
**L’esprit de l’escalier**
We all know the feeling of walking away from an argument and instantly thinking of the ideal comeback, or leaving a conversation and remembering the perfect contribution to a no-longer relevant subject. In French, l’esprit de l’escalier is the term used to refer to that irritating feeling. It literally translates as “the spirit of the staircase”, more commonly known as “staircase wit”. It comes from the idea of thinking of a response as you’re leaving somebody’s house, via their staircase.
**Schlimazel**
The Mr Men series of books by Roger Hargreaves is a staple of many a British child’s bookshelves, and there is a word which could have been created for the character Mr Bump. Like Mr Bump, a Schlimazel is “a consistently unlucky, accident-prone person, a born loser”. It is a Yiddish word, coming from the Middle High German word slim meaning “crooked” and the Hebrew mazzāl meaning “luck”.
**Depaysement**
Ever go on holiday, only to experience a strange sensation of disorientation at the change of scenery? Dépaysement is a French word which refers to that feeling of disorientation that specifically arises when you are not in your home country.
**Duende**
This Spanish term implies something magical or enchanting. It originally referred to a supernatural being or spirit similar to an imp or pixie (and is occasionally borrowed in that sense into English with reference to Spanish and Latin American folklore). Now, it has adapted to refer to the spirit of art or the power that a song or piece of art has to deeply move a person.
**Torschlusspanik**
Are you getting older? Scared of being left behind or “left on the shelf”? This British idiom has its own word in German: Torschlusspanik, which literally translates as “panic at the shutting of a gate”, is used frequently in a general sense meaning “last-minute panic”, of the type you might experience before a deadline.
*Do*Not*Return*these four star lines
*word***
***word*
word**
Using grep
Using grep, it’s fairly straightforward to get a word list:
$ grep -E -o '\*\*[^*]{,20}\*\*' ~/Downloads/wordlist.txt
**Schadenfreude**
**Waldeinsamkeit**
**L’esprit de l’escalier**
**Schlimazel**
**Depaysement**
**Duende**
**Torschlusspanik**
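To see what each piece of the pattern does, here is a minimal sketch run against a small made-up sample (the path /tmp/sample-wordlist.txt and its contents are illustrative, not the original file). The interval is written as {0,20}, the portable spelling of {,20}; omitting the lower bound is a GNU grep extension. Entries longer than the bound are skipped, so raise the number if your list contains longer phrases.

```shell
# Build a small illustrative sample (hypothetical path and contents).
sample=/tmp/sample-wordlist.txt
printf '%s\n' \
  '**Duende**' \
  'Some prose that mentions Duende without markers.' \
  '*word***' \
  '***word*' \
  '**ThisEntryIsLongerThanTwentyChars**' > "$sample"

# \*\*        two literal asterisks
# [^*]{0,20}  up to 20 characters that are not asterisks
# -o          print only the matched text, not the whole line
matches=$(grep -E -o '\*\*[^*]{0,20}\*\*' "$sample")
printf '%s\n' "$matches"
```

Only the first line is printed: the prose line has no markers, the unbalanced star lines never form a `**...**` pair, and the last entry exceeds the 20-character bound.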
If you want to remove the ** encasing the words, pipe the output to sed:
$ grep -E -o '\*\*[^*]{,20}\*\*' ~/Downloads/wordlist.txt | sed 's/*//g'
Schadenfreude
Waldeinsamkeit
L’esprit de l’escalier
Schlimazel
Depaysement
Duende
Torschlusspanik
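As an alternative sketch (not part of the original answer), sed alone can select the marked lines and strip the markers in one step. This assumes each entry sits on a line of its own with nothing before or after the ** pairs, and unlike the grep pattern it places no cap on the entry length. The sample path and contents are illustrative.

```shell
# Hypothetical sample file for demonstration only.
sample=/tmp/sample-wordlist2.txt
printf '%s\n' \
  '**Schlimazel**' \
  'A line of ordinary prose.' \
  '**Duende**' \
  'word**' > "$sample"

# -n suppresses the default output; the trailing p prints only the lines
# where the substitution succeeded. \( \) captures the text between the
# markers and \1 puts it back without them.
words=$(sed -n 's/^\*\*\([^*]*\)\*\*$/\1/p' "$sample")
printf '%s\n' "$words"
```

Writing the asterisk as `\*` makes it explicit that a literal character is meant, which can be easier to read than relying on a leading `*` being treated literally.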
Saving index of words to a file
If you want to save your grep and sed output, use the > redirection operator:
$ grep -E -o '\*\*[^*]{,20}\*\*' ~/Downloads/wordlist.txt | sed 's/*//g' > ~/Downloads/wordlist-index.txt
$ cat ~/Downloads/wordlist-index.txt
Schadenfreude
Waldeinsamkeit
L’esprit de l’escalier
Schlimazel
Depaysement
Duende
Torschlusspanik
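A short sketch of how redirection behaves, using a hypothetical scratch file rather than the real index: > creates the file or overwrites it, while >> appends to it. tee is shown as an optional way to save the list and still see it on screen.

```shell
# Hypothetical scratch file for demonstration only.
index=/tmp/wordlist-index-demo.txt

# > truncates the target first, so rerunning a command replaces old content.
printf '%s\n' 'Schadenfreude' 'Duende' > "$index"
printf '%s\n' 'Schlimazel' > "$index"
first_pass=$(cat "$index")        # only the second write survives

# >> appends instead of overwriting.
printf '%s\n' 'Duende' >> "$index"
line_count=$(wc -l < "$index")    # number of lines now in the file

# tee writes to the file (overwriting it) and also passes the text through.
shown=$(printf '%s\n' 'Torschlusspanik' | tee "$index")

printf 'after overwrite: %s\n' "$first_pass"
printf 'lines after append: %s\n' "$line_count"
printf 'tee passed through: %s\n' "$shown"
```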
Note: the original answer, posted yesterday, was enhanced today using a new post from muru on a separate Q&A: Use specified quantifier in grep to retrieve satisfied vocabulary