2.1 - Moving, Renaming, and Copying Files
When using a CLI-based Linux system, it is often necessary to move or copy files to new directories, or to rename them. In a GUI-based environment, this could be accomplished by clicking and dragging a file icon to a new window, or by right-clicking it and selecting "Rename" or "Copy".
To move a file from one location to another, the user executes the mv command, which is short for "move". To use the command, both the source and the destination must be specified, like so:
mv sourcefilepath destinationfilepath
The path to both source and destination must be accurate down to the filename, or the command will not produce the desired result.
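For example, assuming a hypothetical file called notes.txt and an existing directory called archive, the following moves the file into that directory:
mv notes.txt archive/notes.txt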
The mv command can also be used to rename a file, like so:
mv oldname newname
If the filenames don't match when moving a file, it will also be renamed. This can cause a lot of confusion if done unintentionally. Additionally, if a file of the same name already exists in the destination, it could be overwritten. Adding -i to the command will require mv to prompt the user before overwriting a file.
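As a quick sketch, assuming a hypothetical file called draft.txt, the following renames it to final.txt, prompting first if a final.txt already exists:
mv -i draft.txt final.txt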
If desired, the user can also place a copy of a file in a new location, while leaving an identical copy in the original directory. For this, the cp command is used (short for "copy"). It takes mostly the same form as mv:
cp sourcefilepath destinationfilepath
If the destination file doesn't already exist, it will be created. If it does, its contents are overwritten with the contents of the source file.
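For instance, assuming the same hypothetical notes.txt, the following leaves the original in place and creates a duplicate called notes.bak (cp also accepts -i to prompt before overwriting):
cp -i notes.txt notes.bak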
2.2 - Gathering Information About Files
In section 1.4, the file command was introduced as a way to determine the data type of a file or set of files. This is a very useful tool for learning how to work with certain files, but it is not the only important utility to know.
The du command, short for "disk usage", will return information about the size of a file or directory. It has several helpful options:
- -h: "human readable". Outputs sizes with unit suffixes such as K, M, and G, rather than raw kilobyte counts.
- -s: "summarize". Reduces the lines for individual items into one line describing the size of the directory.
- -a: "all". Shows and analyzes all individual files in the directory.
- --time: shows the time each file was last changed.
- -c: "total". Adds a line to the output describing the total size of the combined items.
Another useful utility to know is xxd. This command creates a hex dump of a specified file. It is used like this:
xxd file
If the file is already a hex dump, it can be reversed by adding the -r option ("revert"). Normally, xxd prints to the terminal, but the output can be directed to a file like so:
xxd inputfile outputfile
xxd is useful because it can expose a file's signature. This can then be used to determine what type of file is being examined, and help determine how to work with it. Many file types have unique signatures, which positively identify them.
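For example, assuming a hypothetical image called picture.png, the following dumps only the first 16 bytes (the -l option limits the output length), which is enough to reveal the PNG signature, 89 50 4e 47:
xxd -l 16 picture.png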
2.3 - Extracting Content from Files
There are several options available for finding specific content within files. Often, they rely on regular expressions and on sorting data into a specific order.
The first option is grep, which stands for "globally search for a regular expression and print". Given a string or a more complex regular expression as an argument, grep will scan a file (or files) for strings matching the pattern. For each match it finds, it will return, in its entirety, the line where the match occurred. The output of a script or command can also be scanned by piping it into grep.
grep takes several options, following this usage:
grep options pattern file1 file2 ...
- grep -f patternfile: "file". Uses patterns from patternfile rather than a command-line argument.
- grep -i pattern: "ignore case". Makes the search case-insensitive.
- grep -v pattern: "invert". Returns lines that do NOT match the pattern.
Another useful tool is strings, which will scan through a non-human-readable file and extract any human-readable strings it finds. This is useful for analyzing binaries or other machine-language objects.
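For instance, assuming a hypothetical compiled binary called program, the extracted strings can be piped into grep to narrow the results:
strings program | grep -i version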
sort and uniq are often used together to order data and to find anomalies and elements that stand out. sort will organize the lines within a file into a specified order. If no specification is given, the default scheme is alphabetical order. However, these other options can come in handy:
- sort -n: numerical order, based on numbers found in lines.
- sort -r: reverse order; can be combined with other options.
- sort -f: ignore case, i.e. treat "a" and "A" the same.
Usage:
sort options inputfile
By default the sorted result prints to the terminal; it can be saved with the -o option or with output redirection.
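For example, assuming a hypothetical file called scores.txt containing one number per line, the following sorts it from highest to lowest and writes the result with -o:
sort -rn scores.txt -o sortedscores.txt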
uniq scans a file or input and returns unique lines. It works by removing adjacent duplicate lines, meaning that if a line appears more than once in a row, the repeats will be filtered out and the line will only be reported once. However, if duplicate lines are not adjacent, but separated, uniq will not recognize them as duplicates. In that case, even though a line is not actually unique, it will be reported as if it were.
To prevent this, the user can first sort the file, then direct that output into uniq, like so:
sort options inputfile | uniq options >> optionaloutputfile
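As a concrete sketch, assuming a hypothetical file called names.txt, the following counts how many times each line appears (the -c option prefixes each line with its count) and lists the most frequent first:
sort names.txt | uniq -c | sort -rn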
2.4 - Editing and Altering Strings
The way data is presented and written can be changed via the CLI. A common style of writing data is encoded data, produced with a standard encoding method. One method commonly encountered is Base64, which translates data from its raw byte values (for text, ASCII values) into a 64-character alphabet, and vice-versa. The purpose of this is to allow data to be transmitted across channels that are only able to handle text.
Linux systems use a built-in utility, base64, to encode a string as Base64 data. The -d option is used to decode the string.
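For example, encoding the string "hello" and decoding it again:
echo hello | base64
echo aGVsbG8K | base64 -d
The first command prints aGVsbG8K (note that echo appends a trailing newline, which is encoded along with the text), and the second recovers hello.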
Another built-in utility, tr ("translate"), is used for simple find-and-replace operations. Using the format tr oldcharset newcharset, the command replaces instances of certain characters with new characters. This is useful for many things, including capitalization, simple ciphers, etc.
The output of one command can be directed into the tr command, which can in turn be directed elsewhere. Example:
cat names.txt | tr 'a-z' 'A-Z' >> capitalnames.txt
This string of commands will send the contents of "names.txt" to tr, which will translate the text to all capital letters, and then append that data to "capitalnames.txt".
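As an example of a simple cipher, the following sketch applies the classic ROT13 cipher, shifting each letter 13 places; running the output through the same command recovers the original:
echo secret | tr 'a-zA-Z' 'n-za-mN-ZA-M'
This prints frperg.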
2.5 - Compressing and Decompressing Files
File compression is a commonly encountered procedure. Its purpose is to save space and ease transmission without compromising the integrity of the data. Two types of compression exist: "lossless", where size is reduced by identifying and eliminating redundancies, and "lossy", where less important bits are removed outright. As the names imply, no information is lost in lossless compression, but some information is lost in lossy compression.
Many different algorithms exist for the purpose of data compression. Each has its own advantages, disadvantages, and optimizations. Some important ones to know are:
- gzip: lossless algorithm; compressed files generally have a ".gz" extension.
- bzip2: lossless algorithm; compressed files generally have a ".bz2" extension.
- jpeg: lossy algorithm for compressing images; files generally have a ".jpeg" or ".jpg" extension.
- tar: not technically a compression utility, but able to compress and decompress its archives by invoking a compressor such as gzip. Archives generally have a ".tar" extension (".tar.gz" when compressed).
Each utility, having its own way of doing things, leaves a unique file signature.
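As a short sketch tying this back to section 2.2, assuming a hypothetical notes.txt, the following compresses it and then uses xxd to confirm the gzip signature (gzip files begin with the bytes 1f 8b):
gzip notes.txt
xxd -l 2 notes.txt.gz
The original file can be restored with gunzip notes.txt.gz.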