jsapar

JSaPar is a Java library providing a schema based parser/producer of CSV (Comma Separated Values) and flat files.

Java Schema Parser

The javadoc within the package contains more comprehensive documentation regarding the classes mentioned below.

The JSaPar package is a Java library that provides a parser for flat and CSV (Comma Separated Values) files. The concept is that a schema describes how a file should be parsed or written. The schema can be built from an XML document or constructed programmatically in Java code. The output of the parser is usually an org.jsapar.Document object that contains a list of org.jsapar.Line objects, each of which contains a list of org.jsapar.Cell objects.
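The nesting of document, lines and cells can be pictured with a small sketch. The classes below are hypothetical stand-ins written for illustration only. They are not the org.jsapar classes, and the parse helper, the separator and the column names are assumptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical stand-ins for illustration; NOT the org.jsapar classes.
class DocumentModelSketch {
    static class Cell {
        final String name;
        final String value;
        Cell(String name, String value) {
            this.name = name;
            this.value = value;
        }
    }

    static class Line {
        final String lineType;
        final List<Cell> cells = new ArrayList<>();
        Line(String lineType) {
            this.lineType = lineType;
        }
    }

    static class Document {
        final List<Line> lines = new ArrayList<>();
    }

    // Plays the role of a very small "schema": a line type, a separator and column names.
    static Document parse(List<String> rows, String lineType, String separator, String... columns) {
        Document document = new Document();
        for (String row : rows) {
            // Limit -1 keeps trailing empty cells instead of dropping them.
            String[] values = row.split(Pattern.quote(separator), -1);
            Line line = new Line(lineType);
            for (int i = 0; i < columns.length && i < values.length; i++) {
                line.cells.add(new Cell(columns[i], values[i]));
            }
            document.lines.add(line);
        }
        return document;
    }

    public static void main(String[] args) {
        Document document = parse(Arrays.asList("Erik;Svensson", "Anna;Berg"),
                                  "Person", ";", "firstName", "lastName");
        for (Line line : document.lines) {
            for (Cell cell : line.cells) {
                System.out.println(line.lineType + "." + cell.name + " = " + cell.value);
            }
        }
    }
}
```

In the real library the schema, not a hard-coded column list, decides how each row is split into cells.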

Supported file formats:

Events for each line

For very large files, building the complete org.jsapar.Document in memory before further processing can be a problem, since it may simply take up too much memory. In that case you can choose to receive an event for each parsed line instead. You do that by registering a subclass of org.jsapar.ParsingEventListener with the org.jsapar.input.Parser. That way you can process one line at a time, freeing memory as you go along.
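The idea of handling one line at a time can be sketched in plain Java. This is a minimal illustration and makes no claim about the library's actual listener interface: the LineListener callback, the parse helper and the semicolon separator are hypothetical names and assumptions chosen for the example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.regex.Pattern;

// Hypothetical sketch of per-line events; NOT the org.jsapar listener API.
class LineEventSketch {
    interface LineListener {
        void lineParsed(long lineNumber, String[] cells);
    }

    // Reads one row at a time and hands it to the listener; no row is kept afterwards.
    static void parse(Reader reader, String separator, LineListener listener) throws IOException {
        BufferedReader buffered = new BufferedReader(reader);
        String row;
        long lineNumber = 0;
        while ((row = buffered.readLine()) != null) {
            lineNumber++;
            listener.lineParsed(lineNumber, row.split(Pattern.quote(separator), -1));
        }
    }

    // Example listener: counts the lines whose first cell equals the wanted value.
    static long countLinesWithFirstCell(String input, String wanted) {
        long[] count = {0};
        try {
            parse(new StringReader(input), ";", (lineNumber, cells) -> {
                if (cells.length > 0 && cells[0].equals(wanted)) {
                    count[0]++;
                }
            });
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count[0];
    }

    public static void main(String[] args) {
        System.out.println(countLinesWithFirstCell("Erik;Svensson\nAnna;Berg\nErik;Lind", "Erik"));
    }
}
```

Because the listener only ever sees the current row, memory use stays flat regardless of the size of the input.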

Converter

If you are only interested in converting a file from one format into another, you can use org.jsapar.io.Converter, where you specify the input and the output schema for the conversion. The converter uses the event mechanism under the hood, so it reads, converts and writes one line at a time. This keeps memory usage low even for large files.
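As a rough illustration of the read, convert and write cycle, the sketch below turns separated rows into fixed width rows one line at a time. It does not use the library. The class name, the helper methods and the column widths are assumptions made for the example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.regex.Pattern;

// Hypothetical sketch of line-by-line conversion; NOT org.jsapar.io.Converter.
class StreamingConvertSketch {
    // Each input row is written out before the next one is read.
    static void convert(Reader in, Writer out, String separator, int[] widths) throws IOException {
        BufferedReader reader = new BufferedReader(in);
        String row;
        while ((row = reader.readLine()) != null) {
            String[] cells = row.split(Pattern.quote(separator), -1);
            StringBuilder line = new StringBuilder();
            for (int i = 0; i < widths.length; i++) {
                String value = i < cells.length ? cells[i] : "";
                line.append(fit(value, widths[i]));
            }
            out.write(line.append('\n').toString());
        }
        out.flush();
    }

    // Truncates or right-pads with spaces so the value is exactly 'width' characters.
    static String fit(String value, int width) {
        if (value.length() >= width) {
            return value.substring(0, width);
        }
        StringBuilder padded = new StringBuilder(value);
        while (padded.length() < width) {
            padded.append(' ');
        }
        return padded.toString();
    }

    // Convenience wrapper for in-memory input, using ';' as the separator.
    static String convertToString(String input, int[] widths) {
        StringWriter out = new StringWriter();
        try {
            convert(new StringReader(input), out, ";", widths);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(convertToString("Erik;Svensson\nAnna;Berg", new int[] {5, 8}));
    }
}
```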

Building java objects

Use the method org.jsapar.Parser.buildJava() to build Java objects for each line in a file (or other input). Note that the schema has to be written carefully for this feature to work. For instance, the line type (name) of each line within the schema has to contain the fully qualified class name of the Java class to build for that line.
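Why the line type must carry a complete class name becomes clearer with a sketch of how an object could be built from it. The code below is an assumption about the general technique (reflection on a class name plus setters), not the library's implementation. The Person bean and the build helper are hypothetical, and only String setters are handled.

```java
import java.lang.reflect.Method;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of building a bean from a class name; NOT the library's code.
class BeanBuildSketch {
    // Example bean: default constructor plus getters and setters.
    public static class Person {
        private String firstName;
        private String lastName;

        public Person() {
        }

        public String getFirstName() { return firstName; }
        public void setFirstName(String firstName) { this.firstName = firstName; }
        public String getLastName() { return lastName; }
        public void setLastName(String lastName) { this.lastName = lastName; }
    }

    // The class name stands in for the line type; each cell name maps to a setter.
    static Object build(String className, Map<String, String> cells) {
        try {
            Object bean = Class.forName(className).getDeclaredConstructor().newInstance();
            for (Map.Entry<String, String> cell : cells.entrySet()) {
                String name = cell.getKey();
                String setter = "set" + Character.toUpperCase(name.charAt(0)) + name.substring(1);
                Method method = bean.getClass().getMethod(setter, String.class);
                method.invoke(bean, cell.getValue());
            }
            return bean;
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("Cannot build " + className, e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> cells = new LinkedHashMap<>();
        cells.put("firstName", "Erik");
        cells.put("lastName", "Svensson");
        Person person = (Person) build(Person.class.getName(), cells);
        System.out.println(person.getFirstName() + " " + person.getLastName());
    }
}
```

A class name that cannot be resolved, or a cell without a matching setter, fails immediately in this sketch, which is one reason the schema needs to be written with care.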

Converting java objects into a file

Use the class org.jsapar.input.JavaBuilder to convert Java objects into an org.jsapar.Document, which can then be used to produce the output file according to a schema.
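The reverse direction can be sketched in the same spirit: read the getters of a bean and turn each property into a named cell. This is only an illustration of the technique under stated assumptions. The Person bean, the toCells helper and the semicolon separator are hypothetical and do not reflect how the library's builder works internally.

```java
import java.lang.reflect.Method;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of turning a bean into named cells; NOT the library's builder.
class BeanToCellsSketch {
    public static class Person {
        private final String firstName;
        private final String lastName;

        public Person(String firstName, String lastName) {
            this.firstName = firstName;
            this.lastName = lastName;
        }

        public String getFirstName() { return firstName; }
        public String getLastName() { return lastName; }
    }

    // Collects every public no-argument getXxx() as a cell named xxx, sorted by name.
    static Map<String, String> toCells(Object bean) {
        Map<String, String> cells = new TreeMap<>();
        for (Method method : bean.getClass().getMethods()) {
            String name = method.getName();
            boolean isGetter = name.startsWith("get") && name.length() > 3
                    && method.getParameterCount() == 0 && !name.equals("getClass");
            if (!isGetter) {
                continue;
            }
            try {
                String cellName = Character.toLowerCase(name.charAt(3)) + name.substring(4);
                cells.put(cellName, String.valueOf(method.invoke(bean)));
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("Cannot read " + name, e);
            }
        }
        return cells;
    }

    public static void main(String[] args) {
        Map<String, String> cells = toCells(new Person("Erik", "Svensson"));
        // Writes the cell values as one separated output line.
        System.out.println(String.join(";", cells.values()));
    }
}
```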

Using xml as input

It is possible to build an org.jsapar.Document from an XML document that conforms to XMLDocumentFormat.xsd (http://jsapar.tigris.org/XMLDocumentFormat/1.0). Use the class org.jsapar.input.XmlDocumentParser to convert an XML document into an org.jsapar.Document.

Examples

The files for the examples below are provided in the samples folder of the project. The JUnit test org.jsapar.JSaParExamplesTest.java contains a more comprehensive set of examples of how to use the package.

Example of reading a CSV file into a Document object according to an XML schema:

        
try(Reader schemaReader = new FileReader("samples/01_CsvSchema.xml");
    Reader fileReader = new FileReader("samples/01_Names.csv")){
   Xml2SchemaBuilder xmlBuilder = new Xml2SchemaBuilder();
   Parser parser = new Parser(xmlBuilder.build(schemaReader));
   Document document = parser.build(fileReader);
}
        
    

Example of converting a CSV file into a fixed width file according to two XML schemas:

        
try(Reader inSchemaReader = new FileReader("samples/01_CsvSchema.xml");
    Reader outSchemaReader = new FileReader("samples/02_FixedWidthSchema.xml")) {
    Xml2SchemaBuilder xmlBuilder = new Xml2SchemaBuilder();
    File outFile = new File("samples/02_Names_out.txt");
    try(Reader inReader = new FileReader("samples/01_Names.csv");
        Writer outWriter = new FileWriter(outFile)) {
        Converter converter = new Converter(xmlBuilder.build(inSchemaReader),
                                            xmlBuilder.build(outSchemaReader));
        converter.convert(inReader, outWriter);
    }
    Assert.assertTrue(outFile.isFile());
}
        
    

Example of converting a CSV file into a list of Java objects according to an XML schema:

        
// try-with-resources closes both readers even if parsing fails
try(Reader schemaReader = new FileReader("samples/07_CsvSchemaToJava.xml");
    Reader fileReader = new FileReader("samples/07_Names.csv")){
    Xml2SchemaBuilder xmlBuilder = new Xml2SchemaBuilder();
    Parser parser = new Parser(xmlBuilder.build(schemaReader));
    // Cell parse errors are collected in this list
    List<CellParseError> parseErrors = new LinkedList<>();
    List<TstPerson> people = parser.buildJava(fileReader, parseErrors);
}
    
If you want to run this example, you will need the class org.jsapar.TstPerson within your classpath. The class is not included in the jar file or in the binary package but it can be found in the source package. As an alternative you can create your own TstPerson class and modify the schema 07_CsvSchemaToJava.xml to use that class instead. The class should contain a default constructor plus getters and setters for all the attributes used in the schema.