UNIX and the bash shell
Several operators compare integers, which is useful in if/else statements.
-eq: equal; -ne: not equal; -ge: greater than or equal to; -le: less than or equal to; -gt: greater than; -lt: less than.
Example: if [ 4 -ge 2 ]; then… (note the square brackets, with a space after [ and before ])
Note: "4 -ge 2" has no meaning on its own; it must appear inside the square brackets (or after the test command).
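As a minimal sketch of these operators in a complete if/else statement (the variables a and b and their values are made up for illustration):

```shell
# Hypothetical values chosen only for illustration.
a=4
b=2

# -ge compares the two values as integers.
if [ "$a" -ge "$b" ]; then
    result="$a is greater than or equal to $b"
else
    result="$a is less than $b"
fi
echo "$result"
```

Swapping -ge for any of the other operators listed above works the same way.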
You can save shell commands in a code file (a shell script file), which should end in the extension .sh. To be safe, so that it executes as a bash shell script from any shell, it's best to put
#!/bin/bash
as the first line of the file, so that it is always executed using the bash shell.
You can run the commands in a shell script in three ways:
./myShellFile.sh
. myShellFile.sh
source myShellFile.sh
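To do it the first way, UNIX needs to know that the file is executable, so you first need to run: chmod u+x myShellFile.sh
The first way runs the script in a new shell process; the other two run its commands in your current shell, so any variables the script sets remain defined afterwards.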
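One practical difference between executing a script and sourcing it is whether its variables survive in your current shell. The sketch below builds a throwaway script in a temporary directory (the variable GREETING is hypothetical) and runs it both ways:

```shell
# Create a throwaway script in a temporary directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '#!/bin/bash\nGREETING="hello from script"\necho "$GREETING"\n' > myShellFile.sh
chmod u+x myShellFile.sh

# 1. Execute it: runs in a separate process, so GREETING is not set here.
unset GREETING
./myShellFile.sh
after_exec="${GREETING:-unset}"

# 2. Source it with ".": runs in the current shell, so GREETING persists.
. ./myShellFile.sh
after_source="${GREETING:-unset}"

echo "after executing: $after_exec"
echo "after sourcing:  $after_source"
```

The sketch writes ". ./myShellFile.sh" with an explicit path so that it also works in shells that do not search the current directory when sourcing.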
Text files created in UNIX have lines that end with the ASCII character \n (i.e., "new line" or "line feed"). Text files created in DOS end with both \r and \n, i.e., \r\n. \r is a "carriage return" and you'll sometimes see it printed as "^M" when you open a DOS text file in a UNIX text editor.
DOS text files with \r in them can sometimes mess things up in UNIX because a given command or program doesn't know what to do with the \r characters; at least one of you ran into this when trying to run a shell script file.
On the SCF Linux machines, which run the Ubuntu variant of Linux, you can convert from DOS format to UNIX format with fromdos myFile, and convert back with todos myFile.
In other variants of UNIX, you'll need dos2unix and unix2dos in place of fromdos and todos.
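If none of these conversion tools is installed, one generic alternative (not mentioned above, so treat it as a fallback sketch) is to delete the carriage returns with tr. The file names here are temporary and hypothetical:

```shell
# Build a small DOS-style file with \r\n line endings.
dosfile=$(mktemp)
printf 'line one\r\nline two\r\n' > "$dosfile"

# Count the lines that contain a carriage return before conversion.
cr=$(printf '\r')
before=$(grep -c "$cr" "$dosfile")

# Strip every \r, writing the result to a new file.
unixfile=$(mktemp)
tr -d '\r' < "$dosfile" > "$unixfile"
after=$(grep -c "$cr" "$unixfile")

echo "lines with a carriage return: before=$before after=$after"
```

Unlike fromdos, this writes a new file rather than converting in place.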
export -f functionname # export the function so that shells started from the current one can also use it, much as export does for a variable.
for i in $(seq 1 1 $num) # start a for loop running from 1 to num (an integer variable). Put "read num" before the loop to read the value from input.
Hello () { echo "Hello World $1 $2"; }; Hello arg1 arg2 # This shows how to pass parameters to a function: inside the function, $1 and $2 are the first and second arguments.
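Written out over several lines, the function and the loop might look like the sketch below. The value of num is fixed here for illustration; replacing that assignment with "read num" would take it from input instead.

```shell
# Hello prints a greeting built from its first two arguments.
Hello () {
    echo "Hello World $1 $2"
}

greeting=$(Hello arg1 arg2)
echo "$greeting"

num=3      # hypothetical value
count=0
for i in $(seq 1 1 "$num"); do
    Hello "iteration" "$i"
    count=$((count + 1))
done
echo "loop ran $count times"
```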
In addition to selecting columns based on a delimiter using the -d and -f flags, you can also select columns based on fixed width format. For example if you want the 2nd through 5th characters of every line in file.txt:
cut -c2-5 file.txt
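To compare the two modes side by side, here is a small sketch on made-up data (the file contents are hypothetical):

```shell
# Sample data written to a temporary file.
datafile=$(mktemp)
printf 'alice,30,NY\nbob,25,LA\n' > "$datafile"

# Delimiter-based: the second comma-separated field of each line.
ages=$(cut -d',' -f2 "$datafile")

# Fixed-width: characters 2 through 5 of each line.
chars=$(cut -c2-5 "$datafile")

echo "$ages"
echo "$chars"
```

Note that -c counts characters regardless of delimiters, so the second line yields "ob,2".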
- grep -v : selects non-matching lines
- grep -o : returns only the matched part of a line
- grep -i "string" FILE # searches for the given string/pattern case-insensitively
- grep -w "string" FILE # searches for a whole word and avoids matching substrings
- grep -A N "string" FILE # prints N lines after the match
- grep -B N "string" FILE # prints N lines before the match
- grep -C N "string" FILE # shows N lines before and after the match
- grep -r "string" * # searches all files under the current directory and its subdirectories
- grep -c "string" FILE # counts how many lines match the given string
- grep -l "string" * # shows the names of files that match the given pattern
- grep -o -b "string" FILE # shows the byte offset of each match, counted from the start of the file
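A quick way to see what several of these flags do is to run them against a tiny file. The sketch below uses made-up contents in a temporary file:

```shell
f=$(mktemp)
printf 'Apple pie\napple\npineapple juice\nbanana\n' > "$f"

count_i=$(grep -c -i "apple" "$f")     # matching lines, ignoring case
count_w=$(grep -c -w "apple" "$f")     # "apple" as a whole word only
non_match=$(grep -v -i "apple" "$f")   # lines that do not match
only=$(grep -o "an" "$f")              # just the matched text, one per match
after=$(grep -A 1 '^apple$' "$f")      # the match plus 1 line after it

echo "case-insensitive: $count_i, whole word: $count_w"
echo "$non_match"
echo "$only"
echo "$after"
```

The whole-word count is lower because -w rejects "pineapple" and, without -i, the capitalized "Apple".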
- sed 's/string1/string2/' FILE1 > FILE2 # change the first "string1" on each line of FILE1 to "string2" and save to FILE2
- sed 's/string1/string2/g' FILE1 > FILE2 # change every "string1" in FILE1 to "string2" and save to FILE2
- Use & to refer to the matched pattern:
- echo 123 abc | sed 's/[0-9]*/& &/' shows 123 123 abc
- Use \1 and \2 to refer to the groups captured by \( \) and rearrange them:
- echo "stat comp" | sed 's/\([a-z]*\) \([a-z]*\)/\2 \1/' shows comp stat
- sed '2d' filename # remove the 2nd line
- sed '$d' filename # remove the last line
- awk 'BEGIN{FS = "|"}; $15 > 200' # return the lines whose value in column 15 is bigger than 200. The column separator is |
- awk -F"|" '{SUM+=$15;} END {print SUM;}' # set the column separator to |, calculate the sum of column 15, and print it at the end.
- awk '{temp = $2; $2 = $3; $3 = temp; print}' # switch column 2 and column 3.
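The sed and awk one-liners above can be tried on throwaway data. This is only a sketch: the file contents, the padding columns, and the variable names are invented so that column 15 exists and the expected results are easy to check.

```shell
# sed: a three-line input file.
f=$(mktemp)
printf 'one one\ntwo\nthree\n' > "$f"

first_only=$(sed 's/one/1/' "$f" | head -n 1)   # first match on the line
every=$(sed 's/one/1/g' "$f" | head -n 1)       # every match on the line
doubled=$(echo "123 abc" | sed 's/[0-9]*/& &/')
swapped=$(echo "stat comp" | sed 's/\([a-z]*\) \([a-z]*\)/\2 \1/')
no_second=$(sed '2d' "$f")
no_last=$(sed '$d' "$f")

# awk: three rows of 15 pipe-separated columns; column 15 is numeric.
pad='x|x|x|x|x|x|x|x|x|x|x'   # 11 filler columns (columns 4 to 14)
data=$(mktemp)
printf '%s\n' "r1|a|b|$pad|150" "r2|c|d|$pad|250" "r3|e|f|$pad|300" > "$data"

big=$(awk 'BEGIN{FS = "|"}; $15 > 200' "$data" | cut -d'|' -f1)
total=$(awk -F"|" '{SUM+=$15;} END {print SUM;}' "$data")
flipped=$(echo "one two three" | awk '{temp = $2; $2 = $3; $3 = temp; print}')

echo "$first_only / $every / $doubled / $swapped"
echo "rows over 200: $big"
echo "sum of column 15: $total"
echo "$flipped"
```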
ssh -X username@$1.berkeley.edu ps aux | sed '1d' | grep "exec/R" | sort -k3nr,3 | head -n "$2" >tmp.q # log into the SCF machine named by the first argument and run ps aux there, delete the first (header) line, keep the lines containing "exec/R", sort in reverse numerical order on column 3, keep the first several rows (the number is given by the second argument), and save the result in a file named tmp.q.
- head -n -3: prints the entire file except for the last 3 lines
- tail -n +3: prints the entire file except for the first 2 lines (i.e., starting from the third line)
- Even though +3 = 3, "tail -n 3" is quite different from "tail -n +3"!
- I often forget the finicky details, but it's handy to know that the functionality exists.
- An easy way to experiment with head/tail (to check the behavior of those edge cases) is to use seq (which prints out a sequence of numbers), e.g., "seq 1 20 | head -n -3"
- head/tail -q: never prints the file names (which would otherwise happen if you have multiple files, e.g., "head file1 file2")
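Following the seq suggestion, the three variants can be compared directly. A small sketch (the negative count for head is a GNU extension, so it may not work with other versions of head):

```shell
all_but_last3=$(seq 1 10 | head -n -3)   # lines 1 to 7
from_third=$(seq 1 10 | tail -n +3)      # lines 3 to 10
last3=$(seq 1 10 | tail -n 3)            # lines 8 to 10

echo "head -n -3:" $all_but_last3
echo "tail -n +3:" $from_third
echo "tail -n 3: " $last3
```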
- ``locate <dir_name>``: locate a directory with the given name, e.g. ``locate stat243-fall-2013``
- ``find . -type d -name '<dir_name>'``: find a directory with a given name. The syntax works as follows: ``.`` is where the search starts. Narrowing this to a specific place is much better than, say, searching recursively from the root. ``-type d`` restricts the results to directories ('d' stands for 'directory'); without it, find would match files of any type, so a regular file with the same name would also be included in the result. ``-name`` means search for this name, which you provide as a string. Example: ``find . -type d -name 'stat243-fall-2013'``
In general, find is much more powerful than locate. In fact, if you do $ man find, you will see that you can specify the search depth, and with the -exec flag, you can also pass the found results to another command, among other things. locate seems to be much faster if you have no idea where the file/directory could be.
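As a sketch of the search-depth and -exec features, the following builds a small temporary directory tree (its layout is invented for illustration) and searches it. It uses -maxdepth, which GNU and BSD find both provide but which may be missing from other implementations.

```shell
root=$(mktemp -d)
mkdir -p "$root/a/stat243-fall-2013" "$root/a/b/c/stat243-fall-2013"
touch "$root/a/stat243-fall-2013.csv"

# Directories only, at any depth.
dirs=$(find "$root" -type d -name 'stat243-fall-2013' | wc -l)

# Limit the search to two levels below the starting point.
shallow=$(find "$root" -maxdepth 2 -type d -name 'stat243-fall-2013' | wc -l)

# Pass each result to another command with -exec; {} is the found path.
names=$(find "$root" -maxdepth 2 -name 'stat243-fall-2013*' -exec basename {} \; | sort)

echo "directories found: $dirs (within depth 2: $shallow)"
echo "$names"
```

Dropping -type d and adding a wildcard, as in the last command, is what lets the .csv file appear in the results.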
- {n} where n is an integer >= 1: repeats the previous item exactly n times
- a{2} matches aa
- {n,m} where n >= 0 and m >= n: repeats the previous item between n and m times
- a{2,4} matches aaaa, aaa, or aa
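These repetition counts can be checked with grep. In this sketch, -E enables the {n} and {n,m} syntax and -x requires the pattern to match the whole line (both are choices made for the illustration):

```shell
# Only the line that is exactly two a's long matches a{2}.
exact=$(printf 'a\naa\naaa\n' | grep -E -x 'a{2}')

# Lines of two, three, or four a's match a{2,4}.
range=$(printf 'a\naa\naaa\naaaa\naaaaa\n' | grep -E -x 'a{2,4}')

echo "$exact"
echo "$range"
```

Without -x, a{2} would also match inside longer runs such as aaa, because grep looks for the pattern anywhere in the line.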
- Var1=`head -n1 FILE` # assign the first line of FILE to the variable Var1. Note that the backtick (`) is not a quote ('). Writing Var1=$(head -n1 FILE) is equivalent.
- declare -a ARRAY # define the array ARRAY; here is an example:
- declare -a Week
- Week[0]="Sun" # assign a value
- Week[1]="Mon" # assign a value
- echo ${Week[1]} # print the value of Week[1]
- Mon
You can use the bc command to do basic calculations. It reads expressions from a file or from standard input, so to evaluate something held in a variable or typed on the command line, you need to use a pipe. E.g.,
echo "7 + 8 + 9" | bc
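The same pipe works with shell variables, and the result can be captured with command substitution. A minimal sketch, assuming bc is installed; the variables and values are hypothetical:

```shell
a=7; b=8; c=9      # hypothetical values
total=$(echo "$a + $b + $c" | bc)

# scale sets the number of digits kept after the decimal point in division.
mean=$(echo "scale=2; $total / 3" | bc)

echo "total=$total mean=$mean"
```

Without setting scale, bc performs integer division.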
If you are working on your own Mac and you haven't programmed much before, you may want to install the Developer Tools from your install DVD (they are not part of the default install). Many Unix and bash tools aren't installed unless you do this.
Even after you do this, there are some Unix utilities that aren't available by default in OSX, and this turns out to include wget. One way to install these things is to use Fink, which has a nice GUI here
external link: http://finkcommander.sourceforge.net/
In the case of wget, it also turns out that OSX does come with a very similar tool called curl that you can use instead.
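For a simple download, "wget URL" roughly corresponds to "curl -O URL", which saves the file under its remote name (add -L if the address redirects, since curl does not follow redirects by default). The sketch below uses a local file:// address in place of a real web URL so that it runs offline; the file names are temporary and hypothetical.

```shell
src=$(mktemp)
dest=$(mktemp)
echo "example payload" > "$src"

# -o names the output file; -s silences the progress meter.
curl -s -o "$dest" "file://$src"

cat "$dest"
```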
I found a little shell script + bit of code so that Mac OSX will navigate directories aliased in the Finder on the command line (otherwise you can't cd into those directories). This is useful for me because I usually keep all my code in one place and my course materials in another, and wanted to create an aliased subdirectory in my course directory for the R code relevant to this course:
external link: http://hints.macworld.com/article.php?story=20050828054129701
You also need to compile the C code getTrueName.c and put the executable somewhere on your PATH.
If you're on OS X, the system seems to read your .bash_profile rather than your .bashrc, so I put this in my .bashrc rather than worry about what gets read when (see the bottom of the page):
external link: http://www.joshstaiger.org/archives/2005/07/bash_profile_vs.html