Class LabeledCSVParser

java.lang.Object
com.Ostermiller.util.LabeledCSVParser
All Implemented Interfaces:
CSVParse

public class LabeledCSVParser extends Object implements CSVParse
Decorate a CSVParse object to provide an index of field names. Many CSV files have a list of field names (labels) as the first line. A LabeledCSVParser consumes this line automatically. The methods getLabels(), getLabelIndex(String) and getValueByLabel(String) allow these labels to be discovered and used while parsing CSV data. This class can also be used to skip field labels that are present in a CSV file but not wanted.
Since:
ostermillerutils 1.03.00
Author:
Campbell, Allen T. <allenc28@yahoo.com>, Stephen Ostermiller https://ostermiller.org/contact.pl?regarding=Java+Utilities
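
The decorator idea described above can be illustrated with a small stand-in that uses only the JDK. This is a sketch, not the library's implementation: the class name LabeledRowsSketch is hypothetical, and an Iterator<String[]> is assumed in place of a real CSVParse. The method names mirror those documented on this page so the flow is easy to compare.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/** Hypothetical stand-in for the decorator idea; not the library's code. */
public class LabeledRowsSketch {
    private final Iterator<String[]> rows; // assumed stand-in for the wrapped CSVParse
    private final String[] labels;         // first row, consumed in the constructor
    private String[] lastLine;             // buffer used for label-based lookup

    public LabeledRowsSketch(Iterator<String[]> rows) {
        this.rows = rows;
        this.labels = rows.hasNext() ? rows.next() : new String[0];
    }

    public String[] getLabels() {
        return labels.clone();
    }

    /** Returns the next data row, or null when the rows are exhausted. */
    public String[] getLine() {
        lastLine = rows.hasNext() ? rows.next() : null;
        return lastLine;
    }

    /** Returns the zero-based index of the label, or -1 if it is absent. */
    public int getLabelIdx(String label) {
        return Arrays.asList(labels).indexOf(label);
    }

    /** Returns the value under the label in the last row read, or null. */
    public String getValueByLabel(String label) {
        int idx = getLabelIdx(label);
        if (lastLine == null || idx < 0 || idx >= lastLine.length) {
            return null;
        }
        return lastLine[idx];
    }

    public static void main(String[] args) {
        List<String[]> data = Arrays.asList(
            new String[] {"id", "name"},
            new String[] {"1", "Ada"},
            new String[] {"2", "Grace"});
        LabeledRowsSketch p = new LabeledRowsSketch(data.iterator());
        // Iterate with getLine() until null, looking up values by label.
        while (p.getLine() != null) {
            System.out.println(p.getValueByLabel("name"));
        }
    }
}
```

The label row is consumed once, up front, so callers only ever see data rows from getLine().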
  • Constructor Details

    • LabeledCSVParser

      public LabeledCSVParser(CSVParse parse) throws IOException
      Construct a LabeledCSVParser on a CSVParse implementation.
      Parameters:
      parse - CSVParse implementation
      Throws:
      IOException - if an error occurs while reading.
      Since:
      ostermillerutils 1.03.00
  • Method Details

    • changeDelimiter

      public void changeDelimiter(char newDelim) throws BadDelimiterException
      Change this parser so that it uses a new delimiter.

      The initial delimiter is a comma. The delimiter cannot be changed to a quote or any other character that has special meaning in CSV.

      Specified by:
      changeDelimiter in interface CSVParse
      Parameters:
      newDelim - delimiter to which to switch.
      Throws:
      BadDelimiterException - if the character cannot be used as a delimiter.
      Since:
      ostermillerutils 1.03.00
    • changeQuote

      public void changeQuote(char newQuote) throws BadQuoteException
      Change this parser so that it uses a new character for quoting.

      The initial quote character is a double quote ("). The quote character cannot be changed to a comma or any other character that has special meaning in CSV.

      Specified by:
      changeQuote in interface CSVParse
      Parameters:
      newQuote - character to use for quoting.
      Throws:
      BadQuoteException - if the character cannot be used as a quote.
      Since:
      ostermillerutils 1.03.00
    • getAllValues

      public String[][] getAllValues() throws IOException
      Get all the values from the file.

      If the file has already been partially read, only the values that have not already been read will be included.

      Each line of the file that has at least one value will be represented. Comments and empty lines are ignored.

      The resulting two-dimensional array may be jagged: rows can differ in length.

      The last line of the values is saved and may be accessed by getValueByLabel().

      Specified by:
      getAllValues in interface CSVParse
      Returns:
      all the values from the file or null if there are no more values.
      Throws:
      IOException - if an error occurs while reading.
      Since:
      ostermillerutils 1.03.00
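
Because the returned array may be jagged and may be null, callers should avoid assuming a fixed column count. The following is a minimal sketch of defensive access using only the JDK. The class JaggedRowsSketch and its helper methods are hypothetical, and the sample data is hand-written rather than produced by the parser.

```java
/** Hypothetical helpers for reading a possibly jagged String[][]. */
public class JaggedRowsSketch {
    /** Returns the length of the longest row, or 0 for null input. */
    public static int maxColumns(String[][] rows) {
        if (rows == null) {
            return 0; // getAllValues() is documented to return null at end of input
        }
        int max = 0;
        for (String[] row : rows) {
            max = Math.max(max, row.length);
        }
        return max;
    }

    /** Returns the cell at (r, c), or null when row r has no column c. */
    public static String cellOrNull(String[][] rows, int r, int c) {
        return (c >= 0 && c < rows[r].length) ? rows[r][c] : null;
    }

    public static void main(String[] args) {
        // Hand-written sample shaped like a jagged result.
        String[][] rows = {
            {"1", "Ada", "London"},
            {"2", "Grace"}
        };
        System.out.println(maxColumns(rows));
        System.out.println(cellOrNull(rows, 1, 2));
    }
}
```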
    • getLastLineNumber

      public int getLastLineNumber()
      Get the line number that the last token came from.

      Line breaks that occur in the middle of a token are not counted in the line number.

      The first line of labels does not count towards the line number.

      Specified by:
      getLastLineNumber in interface CSVParse
      Returns:
      line number or -1 if no tokens have been returned yet.
      Since:
      ostermillerutils 1.03.00
    • lastLineNumber

      public int lastLineNumber()
      Get the line number that the last token came from.

      Line breaks that occur in the middle of a token are not counted in the line number.

      The first line of labels does not count towards the line number.

      Specified by:
      lastLineNumber in interface CSVParse
      Returns:
      line number or -1 if no tokens have been returned yet.
      Since:
      ostermillerutils 1.03.00
    • getLine

      public String[] getLine() throws IOException
      Get all the values from a line.

      If the line has already been partially read, only the values that have not already been read will be included.

      In addition to returning all the values from a line, LabeledCSVParser maintains a buffer of the values. This feature allows getValueByLabel(String) to function. In this case getLine() is used simply to iterate CSV data. The iteration ends when null is returned.

      Note: The methods nextValue() and getAllValues() are incompatible with getValueByLabel(String) because the former methods cause the offset of field values to shift and corrupt the internal buffer maintained by getLine().

      Specified by:
      getLine in interface CSVParse
      Returns:
      all the values from the line or null if there are no more values.
      Throws:
      IOException - if an error occurs while reading.
      Since:
      ostermillerutils 1.03.00
    • nextValue

      public String nextValue() throws IOException
      Read the next value from the file. The line number from which this value was taken can be obtained from getLastLineNumber().

      This method is not compatible with getValueByLabel(). Using this method will make getValueByLabel() throw an IllegalStateException for the rest of the line.

      Specified by:
      nextValue in interface CSVParse
      Returns:
      the next value or null if there are no more values.
      Throws:
      IOException - if an error occurs while reading.
      Since:
      ostermillerutils 1.03.00
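
The incompatibility described above can be modelled with a single flag. The sketch below is a hypothetical model of the documented contract, not the library's code: the class NextValueGuardSketch and the methods lineRead and valueRead are invented for illustration, and resetting the flag on the next whole-line read is an assumption.

```java
import java.util.Arrays;

/** Hypothetical model of the documented contract; not the library's code. */
public class NextValueGuardSketch {
    private final String[] labels;
    private String[] lastLine;       // buffer filled by a whole-line read
    private boolean valueReadInLine; // true after a value-at-a-time read

    public NextValueGuardSketch(String... labels) {
        this.labels = labels.clone();
    }

    /** Call when a whole line was read; refills the buffer and clears the flag. */
    public void lineRead(String[] line) {
        lastLine = line;
        valueReadInLine = false; // assumption: a fresh line makes lookups valid again
    }

    /** Call when a single value was read; label lookups become invalid. */
    public void valueRead() {
        valueReadInLine = true;
    }

    /** Looks up a value by label, rejecting the call after a single-value read. */
    public String getValueByLabel(String label) {
        if (valueReadInLine) {
            throw new IllegalStateException("nextValue was used on this line");
        }
        int idx = Arrays.asList(labels).indexOf(label);
        if (lastLine == null || idx < 0 || idx >= lastLine.length) {
            return null;
        }
        return lastLine[idx];
    }

    public static void main(String[] args) {
        NextValueGuardSketch g = new NextValueGuardSketch("id", "name");
        g.lineRead(new String[] {"1", "Ada"});
        System.out.println(g.getValueByLabel("name"));
        g.valueRead();
        try {
            g.getValueByLabel("name");
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

In practice the simplest way to stay clear of this state is to pick one reading style per file: either getLine() with label lookups, or nextValue() without them.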
    • getLabels

      public String[] getLabels() throws IOException
      Return an array of all field names from the top of the CSV file.
      Returns:
      Field names.
      Throws:
      IOException - if an IO error occurs
      Since:
      ostermillerutils 1.03.00
    • getLabelIndex

      @Deprecated public int getLabelIndex(String label)
      Deprecated.
      May swallow an IOException while reading the labels. Use getLabelIdx(String) instead.
      Get the index of the column having the given label. The getLine() method returns an array of field values for a single record of data. This method returns the index of a member of that array based on the specified field name. The first field has the index 0.
      Parameters:
      label - The field name.
      Returns:
      The index of the field name, or -1 if the label does not exist.
      Since:
      ostermillerutils 1.03.00
    • getLabelIdx

      public int getLabelIdx(String label) throws IOException
      Get the index of the column having the given label. The getLine() method returns an array of field values for a single record of data. This method returns the index of a member of that array based on the specified field name. The first field has the index 0.
      Parameters:
      label - The field name.
      Returns:
      The index of the field name, or -1 if the label does not exist.
      Throws:
      IOException - if an IO error occurs
      Since:
      ostermillerutils 1.04.02
    • getValueByLabel

      public String getValueByLabel(String label) throws IllegalStateException
      Given the label for a column, get the value in that column from the last line that was read. If the column cannot be found in the line, null is returned.
      Parameters:
      label - The field name.
      Returns:
      the value from the last line read or null if there is no such value
      Throws:
      IllegalStateException - if nextValue has been called as part of getting the last line. nextValue is not compatible with this method.
      Since:
      ostermillerutils 1.03.00
    • close

      public void close() throws IOException
      Close any stream upon which this parser is based.
      Specified by:
      close in interface CSVParse
      Throws:
      IOException - if an error occurs while closing the stream.
      Since:
      ostermillerutils 1.03.00