FORQLIFT EXAMPLES

(all examples assume the forqlift command is in your path)

forqlift’s syntax is similar to that of svn and other tools: you specify some action, followed by that action’s flags. For example:

forqlift create [... options for "create" ...]

Here, the action is "create."

## - - - - - - - - - - - - - - - - - - - -
GET HELP / SEE OPTIONS


To see a brief description of forqlift actions and syntax, run:
forqlift --help

To see help for all actions, run:
forqlift --full-help

To see help for a specific action:

forqlift [action] --help

(e.g., forqlift create --help)


## - - - - - - - - - - - - - - - - - - - -
CREATE A SEQUENCEFILE

Inside the SequenceFile, each record will use the filename for the key and the file’s contents (as a Hadoop BytesWritable type) for the value.

forqlift create --file=/some/file.seq file1 file2 file3 /path/to/file4


## - - - - - - - - - - - - - - - - - - - -
CREATE A SEQUENCEFILE, TEXT DATA

This time, the value will be a Hadoop Text type, which means your Mapper and Reducer code can just fetch the contents as a big String. (If the value were still BytesWritable, you would have to first convert the raw bytes to text.)

forqlift create --file=/some/file.seq --data-type=text file1.txt file2.xml file3.txt /path/to/file4.xml


## - - - - - - - - - - - - - - - - - - - -
CREATE A SEQUENCEFILE, COMPRESSED TEXT DATA

Text tends to compress well. This can lead to big savings on bandwidth and storage, both of which are especially important if you’re on a slow line and/or you use cloud services, such as Amazon’s S3 or Elastic MapReduce.

As in the previous example, the value will be a Hadoop Text type, so your Mapper and Reducer code can fetch the contents as a String without first converting raw bytes to text.

forqlift create --file=/some/file.seq --data-type=text --compress=bzip2 /path/to/*.xml /another/path/*.txt

(NOTE: As of this writing, even though Elastic MapReduce supports bzip2 input files, it does not support bzip2 compression on SequenceFiles. In that case, please use gzip compression.)
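
One way to keep the codec choice in one place is a small shell helper, sketched below. The function name pick_codec and the target labels "emr" and "local" are invented for this illustration, and it assumes gzip is spelled --compress=gzip (check forqlift create --help to confirm). The final command is only printed, not run.

```shell
#!/bin/sh
# Hypothetical helper (not part of forqlift): map a target label to a codec.
# "emr" and "local" are labels made up for this sketch.
pick_codec() {
    case "$1" in
        emr) echo gzip ;;   # per the note above, avoid bzip2 SequenceFiles on EMR
        *)   echo bzip2 ;;  # otherwise use bzip2, as in the example above
    esac
}

codec=$(pick_codec emr)

# Print the command for review instead of running it.
# ASSUMPTION: "gzip" is accepted as a --compress value.
echo "forqlift create --file=/some/file.seq --data-type=text --compress=$codec /path/to/*.xml"
```

Once the printed command looks right, it can be pasted into the shell and run.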


## - - - - - - - - - - - - - - - - - - - -
LIST THE CONTENTS OF A SEQUENCEFILE

forqlift list /some/file.seq


## - - - - - - - - - - - - - - - - - - - -
EXTRACT THE CONTENTS OF A SEQUENCEFILE

Extract to current directory:
forqlift extract --file=/some/file.seq

Extract to another directory (paths will be created as needed):
forqlift extract --file=/some/file.seq --dir=/another/directory
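
A quick round trip can confirm that what comes out matches what went in. The sketch below uses only the create and extract flags shown in this document. The temporary paths and sample files are made up, and it assumes extracted files land under --dir with the same names that were passed to create. If forqlift is not in your path, it skips the check.

```shell
#!/bin/sh
# Scratch area with two small sample files (names are arbitrary).
work=$(mktemp -d)
mkdir "$work/in" "$work/out"
echo "first file"  > "$work/in/a.txt"
echo "second file" > "$work/in/b.txt"

if command -v forqlift >/dev/null 2>&1; then
    # Run from inside the input directory so only plain file names are passed.
    (cd "$work/in" && forqlift create --file="$work/sample.seq" a.txt b.txt)
    forqlift extract --file="$work/sample.seq" --dir="$work/out"
    # ASSUMPTION: extract recreates a.txt and b.txt directly under --dir.
    if diff -r "$work/in" "$work/out"; then
        echo "round trip OK"
    else
        echo "round trip differs"
    fi
else
    echo "forqlift not found in PATH; skipping round trip"
fi
```

Remove the scratch directory ($work) when you are done with it.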


## - - - - - - - - - - - - - - - - - - - -
CONVERT A ZIP OR TAR(.BZ2, .GZ) FILE TO A SEQUENCEFILE

(NOTE: This is an experimental feature!)

Note that you can also use the --data-type and --compress options, if need be.

forqlift fromarchive --file=/some/file.seq somefile.tar

You can also squeeze multiple zip or tar files into a single SequenceFile:

forqlift fromarchive --file=/some/file.seq file1.zip file2.tar.bz2 file3.tar.gz file4.tar


## - - - - - - - - - - - - - - - - - - - -
CONVERT A SEQUENCEFILE INTO ZIP OR TAR FORMAT

(NOTE: This is an experimental feature!)

forqlift toarchive --file=/some/file.tar.bz2 file1.seq

or, create one file from several SequenceFiles:

forqlift toarchive --file=/some/file.tar.bz2 file1.seq file2.seq file3.seq


## - - - - - - - - - - - - - - - - - - - -
PASS FLAGS TO FORQLIFT'S JVM (SET MEMORY, ETC)

Use the FORQLIFT_OPTS environment variable, the value of which gets passed to the JVM.

For example, to set forqlift's JVM memory (heap size) to 512MB:

export FORQLIFT_OPTS="-Xmx512m"
forqlift .....
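
If you would rather not change the environment of your whole shell session, standard shell syntax lets you set a variable for a single command. The sketch below demonstrates the scoping with sh -c standing in for forqlift, so it runs without forqlift installed. It shows shell behavior only, not anything specific to forqlift.

```shell
#!/bin/sh
# Start from a clean slate so the demonstration is predictable.
unset FORQLIFT_OPTS

# VAR=value command: the variable is set for that one command only.
# "sh -c" is a stand-in for forqlift here.
child=$(FORQLIFT_OPTS="-Xmx512m" sh -c 'printf %s "$FORQLIFT_OPTS"')
echo "child process saw: $child"

# The calling shell is unaffected.
echo "calling shell has: ${FORQLIFT_OPTS:-(unset)}"
```

With forqlift, the same form would be FORQLIFT_OPTS="-Xmx512m" followed by the forqlift command on the same line.
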


## - - - - - - - - - - - - - - - - - - - -
INCREASE OR DECREASE LOGGING INFORMATION

Most of forqlift's output passes through a logging system.

To see more output (increase log level), run:

forqlift [action] --verbose [other action flags]


To completely disable logging output, run:

forqlift [action] --quiet [other action flags]


If you have uncovered a bug in forqlift, please rerun the command with "--verbose" to get more insight.
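
When preparing a bug report, it helps to save that verbose output to a file. The helper below is a sketch: run_logged and the log file location are made up for this example, and because this document does not say whether forqlift logs to stdout or stderr, it captures both. An echo command stands in for forqlift so the sketch runs anywhere.

```shell
#!/bin/sh
# Hypothetical helper: run a command, saving stdout and stderr to a log file.
# Usage: run_logged LOGFILE COMMAND [ARGS...]
run_logged() {
    log=$1
    shift
    "$@" >"$log" 2>&1
}

# Temporary log file; any writable path would do.
log_file=$(mktemp)

# Stand-in command so the sketch runs without forqlift installed.
run_logged "$log_file" echo "verbose output would appear here"
cat "$log_file"
```

With forqlift, the call would look like: run_logged forqlift-debug.log forqlift [action] --verbose [other action flags]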

## - - - - - - - - - - - - - - - - - - - -
GET INFORMATION ABOUT FORQLIFT'S VERSION

Pass the --version flag to forqlift to see the project version and the version of Hadoop used to build forqlift.

forqlift --version

## - - - - - - - - - - - - - - - - - - - -

