Class StringHelper

java.lang.Object
com.Ostermiller.util.StringHelper

public class StringHelper extends Object
Utilities for String formatting, manipulation, and queries. More information about this class is available from ostermiller.org.
Since:
ostermillerutils 1.00.00
Author:
Stephen Ostermiller https://ostermiller.org/contact.pl?regarding=Java+Utilities
  • Constructor Details

    • StringHelper

      public StringHelper()
  • Method Details

    • prepad

      public static String prepad(String s, int length)
      Pad the beginning of the given String with spaces until the String is of the given length.

      If a String is longer than the desired length, it will not be truncated; however, no padding will be added.

      Parameters:
      s - String to be padded.
      length - desired length of result.
      Returns:
      padded String.
      Throws:
      NullPointerException - if s is null.
      Since:
      ostermillerutils 1.00.00
    • prepad

      public static String prepad(String s, int length, char c)
      Pre-pend the given character to the String until the result is the desired length.

      If a String is longer than the desired length, it will not be truncated; however, no padding will be added.
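
      The padding rule can be sketched in plain Java as follows. This is an illustrative re-implementation based only on the description above, not the library's source; the class name PrepadSketch is hypothetical.

```java
// Illustrative sketch of the documented prepad behavior; not the library's source.
final class PrepadSketch {
    /** Prepends c to s until the result has the given length. */
    static String prepad(String s, int length, char c) {
        int needed = length - s.length(); // NullPointerException if s is null
        if (needed <= 0) {
            return s; // already long enough: no truncation, no padding
        }
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < needed; i++) {
            sb.append(c);
        }
        return sb.append(s).toString();
    }

    public static void main(String[] args) {
        System.out.println(prepad("42", 5, '*'));     // prints ***42
        System.out.println(prepad("123456", 3, '*')); // prints 123456
    }
}
```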

      Parameters:
      s - String to be padded.
      length - desired length of result.
      c - padding character.
      Returns:
      padded String.
      Throws:
      NullPointerException - if s is null.
      Since:
      ostermillerutils 1.00.00
    • prepad

      public static String prepad(int i, int length)
      Pre-pend zeros to the given integer to make a string of the given length.

      If the resulting String is longer than the desired length, it will not be truncated; however, no padding will be added.

      The integer is converted to a string in base 10.

      Parameters:
      i - Integer to be padded.
      length - desired length of result.
      Returns:
      padded String.
      Since:
      ostermillerutils 1.08.00
    • postpad

      public static String postpad(String s, int length)
      Pad the end of the given String with spaces until the String is of the given length.

      If a String is longer than the desired length, it will not be truncated; however, no padding will be added.

      Parameters:
      s - String to be padded.
      length - desired length of result.
      Returns:
      padded String.
      Throws:
      NullPointerException - if s is null.
      Since:
      ostermillerutils 1.00.00
    • postpad

      public static String postpad(String s, int length, char c)
      Append the given character to the String until the result is the desired length.

      If a String is longer than the desired length, it will not be truncated; however, no padding will be added.

      Parameters:
      s - String to be padded.
      length - desired length of result.
      c - padding character.
      Returns:
      padded String.
      Throws:
      NullPointerException - if s is null.
      Since:
      ostermillerutils 1.00.00
    • midpad

      public static String midpad(String s, int length)
      Pad the beginning and end of the given String with spaces until the String is of the given length. The result is that the original String is centered in the middle of the new string.

      If the number of characters to pad is even, the padding will be split evenly between the beginning and end; otherwise, the extra character will be added to the end.

      If a String is longer than the desired length, it will not be truncated; however, no padding will be added.

      Parameters:
      s - String to be padded.
      length - desired length of result.
      Returns:
      padded String.
      Throws:
      NullPointerException - if s is null.
      Since:
      ostermillerutils 1.00.00
    • midpad

      public static String midpad(String s, int length, char c)
      Pad the beginning and end of the given String with the given character until the result is the desired length. The result is that the original String is centered in the middle of the new string.

      If the number of characters to pad is even, the padding will be split evenly between the beginning and end; otherwise, the extra character will be added to the end.

      If a String is longer than the desired length, it will not be truncated; however, no padding will be added.
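
      The centering rule, including where the odd extra character goes, can be sketched like this. It is an illustrative re-implementation of the description above rather than the library's source; MidpadSketch is a hypothetical name.

```java
// Illustrative sketch of the documented midpad behavior; not the library's source.
final class MidpadSketch {
    /** Centers s in a field of the given length using c as padding. */
    static String midpad(String s, int length, char c) {
        int needed = length - s.length(); // NullPointerException if s is null
        if (needed <= 0) {
            return s; // already long enough: returned unchanged
        }
        int before = needed / 2;      // integer division rounds down
        int after = needed - before;  // so an odd extra character lands at the end
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < before; i++) {
            sb.append(c);
        }
        sb.append(s);
        for (int i = 0; i < after; i++) {
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(midpad("ab", 6, '.')); // prints ..ab..
        System.out.println(midpad("ab", 5, '.')); // prints .ab..
    }
}
```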

      Parameters:
      s - String to be padded.
      length - desired length of result.
      c - padding character.
      Returns:
      padded String.
      Throws:
      NullPointerException - if s is null.
      Since:
      ostermillerutils 1.00.00
    • split

      public static String[] split(String s, String delimiter)
      Split the given String into tokens.

      This method is meant to be similar to the split function in other programming languages but it does not use regular expressions. Rather the String is split on a single String literal.

      Unlike java.util.StringTokenizer which accepts multiple character tokens as delimiters, the delimiter here is a single String literal.

      Each null token is returned as an empty String. Delimiters are never returned as tokens.

      If there is no delimiter because it is either empty or null, the only element in the result is the original String.

      StringHelper.split("1-2-3", "-");
      result: {"1","2","3"}
      StringHelper.split("-1--2-", "-");
      result: {"","1","","2",""}
      StringHelper.split("123", "");
      result: {"123"}
      StringHelper.split("1-2---3----4", "--");
      result: {"1-2","-3","","4"}
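
      The results above follow from scanning for the literal delimiter from left to right. The sketch below reproduces them with String.indexOf; it is an illustration of the documented behavior, not the library's source, and SplitSketch is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative sketch of literal (non-regex) splitting; not the library's source.
final class SplitSketch {
    static String[] split(String s, String delimiter) {
        if (s == null) {
            throw new NullPointerException("s");
        }
        if (delimiter == null || delimiter.isEmpty()) {
            return new String[] { s }; // no delimiter: the input is the only token
        }
        List<String> tokens = new ArrayList<>();
        int start = 0;
        int hit;
        while ((hit = s.indexOf(delimiter, start)) >= 0) {
            tokens.add(s.substring(start, hit)); // may be the empty String
            start = hit + delimiter.length();
        }
        tokens.add(s.substring(start)); // text after the last delimiter
        return tokens.toArray(new String[0]);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(split("1-2-3", "-")));         // prints [1, 2, 3]
        System.out.println(Arrays.toString(split("1-2---3----4", "--"))); // prints [1-2, -3, , 4]
    }
}
```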

      Parameters:
      s - String to be split.
      delimiter - String literal on which to split.
      Returns:
      an array of tokens.
      Throws:
      NullPointerException - if s is null.
      Since:
      ostermillerutils 1.00.00
    • splitIncludeDelimiters

      public static String[] splitIncludeDelimiters(String s, String delimiter)
      Split the given String into tokens. Delimiters will be returned as tokens.

      This method is meant to be similar to the split function in other programming languages but it does not use regular expressions. Rather the String is split on a single String literal.

      Unlike java.util.StringTokenizer which accepts multiple character tokens as delimiters, the delimiter here is a single String literal.

      Each null token is returned as an empty String. Delimiters are returned as separate tokens.

      If there is no delimiter because it is either empty or null, the only element in the result is the original String.

      StringHelper.splitIncludeDelimiters("1-2-3", "-");
      result: {"1","-","2","-","3"}
      StringHelper.splitIncludeDelimiters("-1--2-", "-");
      result: {"","-","1","-","","-","2","-",""}
      StringHelper.splitIncludeDelimiters("123", "");
      result: {"123"}
      StringHelper.splitIncludeDelimiters("1-2--3---4----5", "--");
      result: {"1-2","--","3","--","-4","--","","--","5"}
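
      Because every delimiter is kept, concatenating the returned tokens gives back the original String. The sketch below illustrates that property under the behavior documented above; it is not the library's source, and SplitKeepSketch is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of splitting while keeping delimiters; not the library's source.
final class SplitKeepSketch {
    static String[] splitIncludeDelimiters(String s, String delimiter) {
        if (s == null) {
            throw new NullPointerException("s");
        }
        if (delimiter == null || delimiter.isEmpty()) {
            return new String[] { s };
        }
        List<String> tokens = new ArrayList<>();
        int start = 0;
        int hit;
        while ((hit = s.indexOf(delimiter, start)) >= 0) {
            tokens.add(s.substring(start, hit));
            tokens.add(delimiter); // the delimiter itself becomes a token
            start = hit + delimiter.length();
        }
        tokens.add(s.substring(start));
        return tokens.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String input = "-1--2-";
        String[] tokens = splitIncludeDelimiters(input, "-");
        System.out.println(tokens.length);                         // prints 9
        System.out.println(String.join("", tokens).equals(input)); // prints true
    }
}
```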

      Parameters:
      s - String to be split.
      delimiter - String literal on which to split.
      Returns:
      an array of tokens.
      Throws:
      NullPointerException - if s is null.
      Since:
      ostermillerutils 1.05.00
    • join

      public static String join(String[] array)
      Join all the elements of a string array into a single String.

      If the given array is empty, an empty string will be returned. Null elements of the array are allowed and will be treated like empty Strings.

      Parameters:
      array - Array to be joined into a string.
      Returns:
      Concatenation of all the elements of the given array.
      Throws:
      NullPointerException - if array is null.
      Since:
      ostermillerutils 1.05.00
    • join

      public static String join(String[] array, String delimiter)
      Join all the elements of a string array into a single String.

      If the given array is empty, an empty string will be returned. Null elements of the array are allowed and will be treated like empty Strings.
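
      The null handling is the main difference from the JDK's String.join, which renders a null element as the text "null". A minimal sketch of the documented behavior, not the library's source (JoinSketch is a hypothetical name):

```java
import java.util.Objects;

// Illustrative sketch of the documented join behavior; not the library's source.
final class JoinSketch {
    static String join(String[] array, String delimiter) {
        Objects.requireNonNull(array, "array");
        Objects.requireNonNull(delimiter, "delimiter");
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < array.length; i++) {
            if (i > 0) {
                sb.append(delimiter);
            }
            if (array[i] != null) { // null elements contribute nothing
                sb.append(array[i]);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] parts = { "a", null, "c" };
        System.out.println(join(parts, ","));        // prints a,,c
        System.out.println(String.join(",", parts)); // prints a,null,c
    }
}
```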

      Parameters:
      array - Array to be joined into a string.
      delimiter - String to place between array elements.
      Returns:
      Concatenation of all the elements of the given array with the delimiter in between.
      Throws:
      NullPointerException - if array or delimiter is null.
      Since:
      ostermillerutils 1.05.00
    • isEmpty

      public static boolean isEmpty(String s)
      True for the null string and a string of zero length, false otherwise.
      Parameters:
      s - string to test
      Returns:
      Whether or not the string is empty.
      Since:
      ostermillerutils 1.07.01
    • isBlank

      public static boolean isBlank(String s)
      True if the string is null, or has nothing but whitespace characters, false otherwise.
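
      The difference between isEmpty and isBlank only shows up for strings made entirely of whitespace. The sketch below illustrates this; it is not the library's source, and the use of Character.isWhitespace to decide what counts as whitespace is an assumption.

```java
// Illustrative sketch contrasting the two checks; not the library's source.
final class EmptyBlankSketch {
    static boolean isEmpty(String s) {
        return s == null || s.length() == 0;
    }

    static boolean isBlank(String s) {
        if (s == null) {
            return true;
        }
        for (int i = 0; i < s.length(); i++) {
            // Assumption: "whitespace" means Character.isWhitespace.
            if (!Character.isWhitespace(s.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isEmpty("   ")); // prints false
        System.out.println(isBlank("   ")); // prints true
        System.out.println(isBlank(null));  // prints true
    }
}
```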
      Parameters:
      s - string to test
      Returns:
      Whether or not the string is blank.
      Since:
      ostermillerutils 1.07.01
    • replace

      public static String replace(String s, String find, String replace)
      Replace occurrences of a substring.

      StringHelper.replace("1-2-3", "-", "|");
      result: "1|2|3"
      StringHelper.replace("-1--2-", "-", "|");
      result: "|1||2|"
      StringHelper.replace("123", "", "|");
      result: "123"
      StringHelper.replace("1-2---3----4", "--", "|");
      result: "1-2|-3||4"
      StringHelper.replace("1-2---3----4", "--", "---");
      result: "1-2----3------4"
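
      The examples above can be reproduced with a left-to-right literal scan. The sketch below is an illustration, not the library's source; ReplaceSketch is a hypothetical name, and null values for find or replace are not handled. Note that an empty find String leaves the input unchanged, unlike the JDK's String.replace, which inserts the replacement at every position.

```java
// Illustrative sketch of literal substring replacement; not the library's source.
final class ReplaceSketch {
    static String replace(String s, String find, String replace) {
        if (s == null) {
            throw new NullPointerException("s");
        }
        if (find.isEmpty()) {
            return s; // nothing to search for
        }
        StringBuilder sb = new StringBuilder(s.length());
        int start = 0;
        int hit;
        while ((hit = s.indexOf(find, start)) >= 0) {
            sb.append(s, start, hit).append(replace);
            start = hit + find.length(); // continue after the match, never inside it
        }
        return sb.append(s, start, s.length()).toString();
    }

    public static void main(String[] args) {
        System.out.println(replace("1-2---3----4", "--", "|")); // prints 1-2|-3||4
        System.out.println(replace("123", "", "|"));            // prints 123
        System.out.println("123".replace("", "|"));             // JDK: prints |1|2|3|
    }
}
```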
      Parameters:
      s - String to be modified.
      find - String to find.
      replace - String to replace.
      Returns:
      a string with all the occurrences of the string to find replaced.
      Throws:
      NullPointerException - if s is null.
      Since:
      ostermillerutils 1.00.00
    • escapeHTML

      public static String escapeHTML(String s)
      Replaces characters that may be confused by an HTML parser with their equivalent character entity references.

      Any data that will appear as text on a web page should be escaped. This is especially important for data that comes from untrusted sources such as Internet users. A common mistake in CGI programming is to ask a user for data and then put that data on a web page. For example:

       Server: What is your name?
       User: <b>Joe<b>
       Server: Hello Joe, Welcome
      If the name is put on the page without checking that it doesn't contain HTML code or without sanitizing that HTML code, the user could reformat the page, insert scripts, and control the content on your web server.

      This method will replace HTML characters such as > with their HTML entity reference (&gt;) so that the html parser will be sure to interpret them as plain text rather than HTML or script.

      This method should be used for both data to be displayed in text in the html document, and data put in form elements. For example:
      <html><body>This is not a &lt;tag&gt; in HTML</body></html>
      and
      <form><input type="hidden" name="date" value="This data could be &quot;malicious&quot;"></form>
      In the second example, the form data would properly be resubmitted to your CGI script in the URLEncoded format:
      This data could be %22malicious%22
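
      The kind of substitution described above can be sketched as follows. This is a minimal illustration that handles only the four characters &, <, > and "; the set of characters the library actually escapes may be larger, and EscapeHtmlSketch is a hypothetical name.

```java
// Minimal sketch of HTML entity escaping; the library may escape more characters.
final class EscapeHtmlSketch {
    static String escapeHTML(String s) {
        StringBuilder sb = new StringBuilder(s.length() + 16);
        for (int i = 0; i < s.length(); i++) {
            char ch = s.charAt(i);
            switch (ch) {
                case '&': sb.append("&amp;"); break;
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                case '"': sb.append("&quot;"); break;
                default: sb.append(ch);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println("Hello " + escapeHTML("<b>Joe<b>") + ", Welcome");
        // prints Hello &lt;b&gt;Joe&lt;b&gt;, Welcome
    }
}
```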

      Parameters:
      s - String to be escaped
      Returns:
      escaped String
      Throws:
      NullPointerException - if s is null.
      Since:
      ostermillerutils 1.00.00
    • escapeSQL

      public static String escapeSQL(String s)
      Replaces characters that may be confused by an SQL parser with their equivalent escape characters.

      Any data that will be put in an SQL query should be escaped. This is especially important for data that comes from untrusted sources such as Internet users.

      For example if you had the following SQL query:
      "SELECT * FROM addresses WHERE name='" + name + "' AND private='N'"
      Without this function a user could give ' OR 1=1 OR ''=' as their name, causing the query to be:
      "SELECT * FROM addresses WHERE name='' OR 1=1 OR ''='' AND private='N'"
      which will give all addresses, including private ones.
      Correct usage would be:
      "SELECT * FROM addresses WHERE name='" + StringHelper.escapeSQL(name) + "' AND private='N'"

      Another way to avoid this problem is to use a PreparedStatement with appropriate place holders.
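
      A sketch of the PreparedStatement alternative for the query above, using the standard java.sql API. The class and method names are hypothetical, and the caller is assumed to supply an open Connection to a database that has the addresses table.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

// Sketch of a parameterized query; names here are illustrative only.
final class AddressLookupSketch {
    // The ? place holder keeps user data out of the SQL text entirely.
    static final String QUERY = "SELECT * FROM addresses WHERE name=? AND private='N'";

    static List<String> findPublicNames(Connection connection, String name) throws SQLException {
        try (PreparedStatement statement = connection.prepareStatement(QUERY)) {
            statement.setString(1, name); // the driver handles quoting
            try (ResultSet rows = statement.executeQuery()) {
                List<String> names = new ArrayList<>();
                while (rows.next()) {
                    names.add(rows.getString("name"));
                }
                return names;
            }
        }
    }
}
```

      With this form, a name such as ' OR 1=1 OR ''=' is compared as a literal value and cannot change the structure of the query.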

      Parameters:
      s - String to be escaped
      Returns:
      escaped String
      Throws:
      NullPointerException - if s is null.
      Since:
      ostermillerutils 1.00.00
    • escapeJavaLiteral

      public static String escapeJavaLiteral(String s)
      Replaces characters that are not allowed in a Java style string literal with their escape characters. Specifically quote ("), single quote ('), new line (\n), carriage return (\r), backslash (\), and tab (\t) are escaped.
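
      A minimal sketch of escaping the six characters listed above. It illustrates the description and is not the library's source; EscapeJavaSketch is a hypothetical name.

```java
// Illustrative sketch of Java string literal escaping; not the library's source.
final class EscapeJavaSketch {
    static String escapeJavaLiteral(String s) {
        StringBuilder sb = new StringBuilder(s.length() + 8);
        for (int i = 0; i < s.length(); i++) {
            char ch = s.charAt(i);
            switch (ch) {
                case '"':  sb.append("\\\""); break;
                case '\'': sb.append("\\'");  break;
                case '\n': sb.append("\\n");  break;
                case '\r': sb.append("\\r");  break;
                case '\\': sb.append("\\\\"); break;
                case '\t': sb.append("\\t");  break;
                default:   sb.append(ch);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The input contains a real tab; the output contains a backslash followed by t.
        System.out.println(escapeJavaLiteral("a\tb")); // prints a\tb
    }
}
```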
      Parameters:
      s - String to be escaped
      Returns:
      escaped String
      Throws:
      NullPointerException - if s is null.
      Since:
      ostermillerutils 1.00.00
    • trim

      public static String trim(String s, String c)
      Trim any of the characters contained in the second string from the beginning and end of the first.
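
      The second argument is treated as a set of characters, not as a substring. A sketch of that behavior, based on the description above and not on the library's source (TrimSketch is a hypothetical name):

```java
// Illustrative sketch of trimming a set of characters; not the library's source.
final class TrimSketch {
    static String trim(String s, String c) {
        int start = 0;
        int end = s.length(); // NullPointerException if s is null
        while (start < end && c.indexOf(s.charAt(start)) >= 0) {
            start++; // skip leading characters that appear in c
        }
        while (end > start && c.indexOf(s.charAt(end - 1)) >= 0) {
            end--; // skip trailing characters that appear in c
        }
        return s.substring(start, end);
    }

    public static void main(String[] args) {
        System.out.println(trim("--abc-=-", "-=")); // prints abc
        System.out.println(trim("a-b", "-"));       // prints a-b
    }
}
```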
      Parameters:
      s - String to be trimmed.
      c - list of characters to trim from s.
      Returns:
      trimmed String.
      Throws:
      NullPointerException - if s is null.
      Since:
      ostermillerutils 1.00.00
    • unescapeHTML

      public static String unescapeHTML(String s)
      Turn any HTML escape entities in the string into characters and return the resulting string.
      Parameters:
      s - String to be un-escaped.
      Returns:
      un-escaped String.
      Throws:
      NullPointerException - if s is null.
      Since:
      ostermillerutils 1.00.00
    • escapeRegularExpressionLiteral

      public static String escapeRegularExpressionLiteral(String s)
      Escapes characters that have special meaning to regular expressions.
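
      Why escaping matters can be shown with the JDK alone. The sketch below uses java.util.regex.Pattern.quote as a stand-in that serves the same purpose; it does not show this method's output format, and RegexLiteralSketch is a hypothetical name.

```java
import java.util.regex.Pattern;

// Demonstrates literal matching with the JDK's Pattern.quote as a stand-in.
final class RegexLiteralSketch {
    static boolean matchesLiterally(String literal, String input) {
        return Pattern.compile(Pattern.quote(literal)).matcher(input).matches();
    }

    public static void main(String[] args) {
        // Unescaped, the dot matches any character.
        System.out.println(Pattern.matches("1.5", "125")); // prints true
        // Escaped, the dot only matches a dot.
        System.out.println(matchesLiterally("1.5", "125")); // prints false
        System.out.println(matchesLiterally("1.5", "1.5")); // prints true
    }
}
```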
      Parameters:
      s - String to be escaped
      Returns:
      escaped String
      Throws:
      NullPointerException - if s is null.
      Since:
      ostermillerutils 1.02.25
    • getContainsAnyPattern

      public static Pattern getContainsAnyPattern(String[] terms)
      Compile a pattern that will match a string if the string contains any of the given terms.

      Usage:
      boolean b = getContainsAnyPattern(terms).matcher(s).matches();

      If multiple strings are matched against the same set of terms, it is more efficient to reuse the pattern returned by this function.
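
      One way such a pattern could be built is to quote each term and join the terms with alternation, so that a single matches() call tests all of them. The sketch below is an assumption about the approach, not the library's source; ContainsAnySketch is a hypothetical name. It also shows the pattern being compiled once and reused.

```java
import java.util.regex.Pattern;

// Sketch of a "contains any" pattern; an assumed approach, not the library's source.
final class ContainsAnySketch {
    static Pattern containsAny(String[] terms) {
        // (?s) lets . match line terminators so matches() can span the whole input.
        StringBuilder regex = new StringBuilder("(?s).*(?:");
        for (int i = 0; i < terms.length; i++) {
            if (i > 0) {
                regex.append('|');
            }
            regex.append(Pattern.quote(terms[i])); // terms are literals, not regexes
        }
        regex.append(").*");
        return Pattern.compile(regex.toString());
    }

    public static void main(String[] args) {
        Pattern pets = containsAny(new String[] { "cat", "dog" }); // compile once
        for (String s : new String[] { "hotdog stand", "bird" }) { // reuse many times
            System.out.println(s + ": " + pets.matcher(s).matches());
        }
        // prints hotdog stand: true
        // prints bird: false
    }
}
```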

      Parameters:
      terms - Array of search strings.
      Returns:
      Compiled pattern that can be used to match a string to see if it contains any of the terms.
      Since:
      ostermillerutils 1.02.25
    • getEqualsAnyPattern

      public static Pattern getEqualsAnyPattern(String[] terms)
      Compile a pattern that will match a string if the string equals any of the given terms.

      Usage:
      boolean b = getEqualsAnyPattern(terms).matcher(s).matches();

      If multiple strings are matched against the same set of terms, it is more efficient to reuse the pattern returned by this function.

      Parameters:
      terms - Array of search strings.
      Returns:
      Compiled pattern that can be used to match a string to see if it equals any of the terms.
      Since:
      ostermillerutils 1.02.25
    • getStartsWithAnyPattern

      public static Pattern getStartsWithAnyPattern(String[] terms)
      Compile a pattern that will match a string if the string starts with any of the given terms.

      Usage:
      boolean b = getStartsWithAnyPattern(terms).matcher(s).matches();

      If multiple strings are matched against the same set of terms, it is more efficient to reuse the pattern returned by this function.

      Parameters:
      terms - Array of search strings.
      Returns:
      Compiled pattern that can be used to match a string to see if it starts with any of the terms.
      Since:
      ostermillerutils 1.02.25
    • getEndsWithAnyPattern

      public static Pattern getEndsWithAnyPattern(String[] terms)
      Compile a pattern that will match a string if the string ends with any of the given terms.

      Usage:
      boolean b = getEndsWithAnyPattern(terms).matcher(s).matches();

      If multiple strings are matched against the same set of terms, it is more efficient to reuse the pattern returned by this function.

      Parameters:
      terms - Array of search strings.
      Returns:
      Compiled pattern that can be used to match a string to see if it ends with any of the terms.
      Since:
      ostermillerutils 1.02.25
    • getContainsAnyIgnoreCasePattern

      public static Pattern getContainsAnyIgnoreCasePattern(String[] terms)
      Compile a pattern that will match a string if the string contains any of the given terms.

      Case is ignored when matching using Unicode case rules.

      Usage:
      boolean b = getContainsAnyIgnoreCasePattern(terms).matcher(s).matches();

      If multiple strings are matched against the same set of terms, it is more efficient to reuse the pattern returned by this function.

      Parameters:
      terms - Array of search strings.
      Returns:
      Compiled pattern that can be used to match a string to see if it contains any of the terms.
      Since:
      ostermillerutils 1.02.25
    • getEqualsAnyIgnoreCasePattern

      public static Pattern getEqualsAnyIgnoreCasePattern(String[] terms)
      Compile a pattern that will match a string if the string equals any of the given terms.

      Case is ignored when matching using Unicode case rules.
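
      In java.util.regex terms, Unicode case rules correspond to combining the CASE_INSENSITIVE and UNICODE_CASE flags; CASE_INSENSITIVE on its own covers only US-ASCII letters. The sketch below assumes that is how such a pattern is compiled; it is not the library's source, and IgnoreCaseSketch is a hypothetical name.

```java
import java.util.regex.Pattern;

// Sketch of an "equals any, ignoring case" pattern; an assumed approach.
final class IgnoreCaseSketch {
    static Pattern equalsAnyIgnoreCase(String[] terms) {
        StringBuilder regex = new StringBuilder("(?:");
        for (int i = 0; i < terms.length; i++) {
            if (i > 0) {
                regex.append('|');
            }
            regex.append(Pattern.quote(terms[i]));
        }
        regex.append(')');
        // UNICODE_CASE extends case folding beyond US-ASCII.
        return Pattern.compile(regex.toString(),
                Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE);
    }

    public static void main(String[] args) {
        // \u00e9 is a lower case e with acute accent, \u00c9 its upper case form.
        Pattern drinks = equalsAnyIgnoreCase(new String[] { "caf\u00e9", "tea" });
        System.out.println(drinks.matcher("TEA").matches());       // prints true
        System.out.println(drinks.matcher("CAF\u00c9").matches()); // prints true
        System.out.println(drinks.matcher("teapot").matches());    // prints false
    }
}
```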

      Usage:
      boolean b = getEqualsAnyIgnoreCasePattern(terms).matcher(s).matches();

      If multiple strings are matched against the same set of terms, it is more efficient to reuse the pattern returned by this function.

      Parameters:
      terms - Array of search strings.
      Returns:
      Compiled pattern that can be used to match a string to see if it equals any of the terms.
      Since:
      ostermillerutils 1.02.25
    • getStartsWithAnyIgnoreCasePattern

      public static Pattern getStartsWithAnyIgnoreCasePattern(String[] terms)
      Compile a pattern that will match a string if the string starts with any of the given terms.

      Case is ignored when matching using Unicode case rules.

      Usage:
      boolean b = getStartsWithAnyIgnoreCasePattern(terms).matcher(s).matches();

      If multiple strings are matched against the same set of terms, it is more efficient to reuse the pattern returned by this function.

      Parameters:
      terms - Array of search strings.
      Returns:
      Compiled pattern that can be used to match a string to see if it starts with any of the terms.
      Since:
      ostermillerutils 1.02.25
    • getEndsWithAnyIgnoreCasePattern

      public static Pattern getEndsWithAnyIgnoreCasePattern(String[] terms)
      Compile a pattern that will match a string if the string ends with any of the given terms.

      Case is ignored when matching using Unicode case rules.

      Usage:
      boolean b = getEndsWithAnyIgnoreCasePattern(terms).matcher(s).matches();

      If multiple strings are matched against the same set of terms, it is more efficient to reuse the pattern returned by this function.

      Parameters:
      terms - Array of search strings.
      Returns:
      Compiled pattern that can be used to match a string to see if it ends with any of the terms.
      Since:
      ostermillerutils 1.02.25
    • containsAny

      public static boolean containsAny(String s, String[] terms)
      Tests to see if the given string contains any of the given terms.

      This implementation is more efficient than the brute force approach of testing the string against each of the terms. It instead compiles a single regular expression that can test all the terms at once, and uses that expression against the string.

      This is a convenience method. If multiple strings are tested against the same set of terms, it is more efficient not to compile the regular expression multiple times.

      Parameters:
      s - String that may contain any of the given terms.
      terms - list of substrings that may be contained in the given string.
      Returns:
      true iff one of the terms is a substring of the given string.
      Since:
      ostermillerutils 1.02.25
    • equalsAny

      public static boolean equalsAny(String s, String[] terms)
      Tests to see if the given string equals any of the given terms.

      This implementation is more efficient than the brute force approach of testing the string against each of the terms. It instead compiles a single regular expression that can test all the terms at once, and uses that expression against the string.

      This is a convenience method. If multiple strings are tested against the same set of terms, it is more efficient not to compile the regular expression multiple times.

      Parameters:
      s - String that may equal any of the given terms.
      terms - list of strings that may equal the given string.
      Returns:
      true iff one of the terms is equal to the given string.
      Since:
      ostermillerutils 1.02.25
    • startsWithAny

      public static boolean startsWithAny(String s, String[] terms)
      Tests to see if the given string starts with any of the given terms.

      This implementation is more efficient than the brute force approach of testing the string against each of the terms. It instead compiles a single regular expression that can test all the terms at once, and uses that expression against the string.

      This is a convenience method. If multiple strings are tested against the same set of terms, it is more efficient not to compile the regular expression multiple times.

      Parameters:
      s - String that may start with any of the given terms.
      terms - list of strings that the given string may start with.
      Returns:
      true iff the given string starts with one of the given terms.
      Since:
      ostermillerutils 1.02.25
    • endsWithAny

      public static boolean endsWithAny(String s, String[] terms)
      Tests to see if the given string ends with any of the given terms.

      This implementation is more efficient than the brute force approach of testing the string against each of the terms. It instead compiles a single regular expression that can test all the terms at once, and uses that expression against the string.

      This is a convenience method. If multiple strings are tested against the same set of terms, it is more efficient not to compile the regular expression multiple times.

      Parameters:
      s - String that may end with any of the given terms.
      terms - list of strings that the given string may end with.
      Returns:
      true iff the given string ends with one of the given terms.
      Since:
      ostermillerutils 1.02.25
    • containsAnyIgnoreCase

      public static boolean containsAnyIgnoreCase(String s, String[] terms)
      Tests to see if the given string contains any of the given terms.

      Case is ignored when matching using Unicode case rules.

      This implementation is more efficient than the brute force approach of testing the string against each of the terms. It instead compiles a single regular expression that can test all the terms at once, and uses that expression against the string.

      This is a convenience method. If multiple strings are tested against the same set of terms, it is more efficient not to compile the regular expression multiple times.

      Parameters:
      s - String that may contain any of the given terms.
      terms - list of substrings that may be contained in the given string.
      Returns:
      true iff one of the terms is a substring of the given string.
      Since:
      ostermillerutils 1.02.25
    • equalsAnyIgnoreCase

      public static boolean equalsAnyIgnoreCase(String s, String[] terms)
      Tests to see if the given string equals any of the given terms.

      Case is ignored when matching using Unicode case rules.

      This implementation is more efficient than the brute force approach of testing the string against each of the terms. It instead compiles a single regular expression that can test all the terms at once, and uses that expression against the string.

      This is a convenience method. If multiple strings are tested against the same set of terms, it is more efficient not to compile the regular expression multiple times.

      Parameters:
      s - String that may equal any of the given terms.
      terms - list of strings that may equal the given string.
      Returns:
      true iff one of the terms is equal to the given string.
      Since:
      ostermillerutils 1.02.25
    • startsWithAnyIgnoreCase

      public static boolean startsWithAnyIgnoreCase(String s, String[] terms)
      Tests to see if the given string starts with any of the given terms.

      Case is ignored when matching using Unicode case rules.

      This implementation is more efficient than the brute force approach of testing the string against each of the terms. It instead compiles a single regular expression that can test all the terms at once, and uses that expression against the string.

      This is a convenience method. If multiple strings are tested against the same set of terms, it is more efficient not to compile the regular expression multiple times.

      Parameters:
      s - String that may start with any of the given terms.
      terms - list of strings that the given string may start with.
      Returns:
      true iff the given string starts with one of the given terms.
      Since:
      ostermillerutils 1.02.25
    • endsWithAnyIgnoreCase

      public static boolean endsWithAnyIgnoreCase(String s, String[] terms)
      Tests to see if the given string ends with any of the given terms.

      Case is ignored when matching using Unicode case rules.

      This implementation is more efficient than the brute force approach of testing the string against each of the terms. It instead compiles a single regular expression that can test all the terms at once, and uses that expression against the string.

      This is a convenience method. If multiple strings are tested against the same set of terms, it is more efficient not to compile the regular expression multiple times.

      Parameters:
      s - String that may end with any of the given terms.
      terms - list of strings that the given string may end with.
      Returns:
      true iff the given string ends with one of the given terms.
      Since:
      ostermillerutils 1.02.25
    • parseInteger

      public static Integer parseInteger(String s)
      Liberal parse method for integer values. If the input string is a representation of an integer, that value will be returned. Otherwise null is returned. Surrounding white space is NOT significant.

      If the number starts with a base prefix ("0x" for hex, "0b" for binary, "0c" for octal), it will be parsed with that radix. Otherwise, the number will be parsed in base 10 radix.

      This method does NOT throw number format exceptions.
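
      The prefix handling and the null result can be sketched as follows. This illustrates the description above and is not the library's source; ParseIntegerSketch is a hypothetical name, and the treatment of null input, signs, and upper case prefixes is an assumption.

```java
// Illustrative sketch of liberal integer parsing; not the library's source.
final class ParseIntegerSketch {
    static Integer parseInteger(String s) {
        if (s == null) {
            return null; // assumption: null input is treated as unparseable
        }
        String digits = s.trim(); // surrounding white space is not significant
        int radix = 10;
        if (digits.startsWith("0x")) {
            radix = 16;
            digits = digits.substring(2);
        } else if (digits.startsWith("0b")) {
            radix = 2;
            digits = digits.substring(2);
        } else if (digits.startsWith("0c")) {
            radix = 8;
            digits = digits.substring(2);
        }
        try {
            return Integer.valueOf(digits, radix);
        } catch (NumberFormatException e) {
            return null; // report failure as null instead of throwing
        }
    }

    public static void main(String[] args) {
        System.out.println(parseInteger(" 42 "));  // prints 42
        System.out.println(parseInteger("0x1F"));  // prints 31
        System.out.println(parseInteger("0b101")); // prints 5
        System.out.println(parseInteger("0c17"));  // prints 15
        System.out.println(parseInteger("abc"));   // prints null
    }
}
```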

      Parameters:
      s - String containing an integer value to be parsed
      Returns:
      parsed integer value, or null if s does not contain an integer value
      Since:
      ostermillerutils 1.07.01
    • parseInteger

      public static Integer parseInteger(String s, int radix)
      Liberal parse method for integer values. If the input string is a representation of an integer, that value will be returned. Otherwise null is returned. Surrounding white space is NOT significant.

      This method does NOT throw number format exceptions.

      Parameters:
      s - String containing an integer value to be parsed
      radix - number base used during parsing
      Returns:
      parsed integer value, or null if s does not contain an integer value
      Since:
      ostermillerutils 1.07.01
    • parseInt

      public static int parseInt(String s, int defaultValue)
      Liberal parse method for integer values. If the input string is a representation of an integer, that value will be returned. Otherwise the default value is returned. Surrounding white space is NOT significant.

      If the number starts with a base prefix ("0x" for hex, "0b" for binary, "0c" for octal), it will be parsed with that radix. Otherwise, the number will be parsed in base 10 radix.

      This method does NOT throw number format exceptions.

      Parameters:
      s - String containing an integer value to be parsed
      defaultValue - value returned when s does not contain an integer value
      Returns:
      parsed integer value or the default value
      Since:
      ostermillerutils 1.07.01
    • parseInt

      public static int parseInt(String s, int radix, int defaultValue)
      Liberal parse method for integer values. If the input string is a representation of an integer, that value will be returned. Otherwise the default value is returned. Surrounding white space is NOT significant.

      This method does NOT throw number format exceptions.

      Parameters:
      s - String containing an integer value to be parsed
      radix - number base used during parsing
      defaultValue - value returned when s does not contain an integer value
      Returns:
      parsed integer value or the default value
      Since:
      ostermillerutils 1.07.01
    • parseBoolean

      public static Boolean parseBoolean(String s)
      Liberal parse method for boolean values. If the input string is a word that matches a boolean value, that boolean value will be returned. Otherwise null is returned. Comparison is case insensitive. Surrounding white space is NOT significant.

      true includes: true, t, yes, y, 1, ok

      false includes: false, f, no, n, 0, nope
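
      The word lists above can be sketched as two lookup sets. This is an illustration of the documented behavior, not the library's source; ParseBooleanSketch is a hypothetical name, and returning null for null input is an assumption.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Illustrative sketch of liberal boolean parsing; not the library's source.
final class ParseBooleanSketch {
    private static final Set<String> TRUE_WORDS =
            new HashSet<>(Arrays.asList("true", "t", "yes", "y", "1", "ok"));
    private static final Set<String> FALSE_WORDS =
            new HashSet<>(Arrays.asList("false", "f", "no", "n", "0", "nope"));

    static Boolean parseBoolean(String s) {
        if (s == null) {
            return null; // assumption: null input has no boolean value
        }
        // Trim and lower-case so comparison ignores case and surrounding white space.
        String word = s.trim().toLowerCase(Locale.ROOT);
        if (TRUE_WORDS.contains(word)) {
            return Boolean.TRUE;
        }
        if (FALSE_WORDS.contains(word)) {
            return Boolean.FALSE;
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(parseBoolean(" Yes ")); // prints true
        System.out.println(parseBoolean("NOPE"));  // prints false
        System.out.println(parseBoolean("maybe")); // prints null
    }
}
```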

      Parameters:
      s - String containing a boolean value to be parsed.
      Returns:
      true, false, or null
      Since:
      ostermillerutils 1.07.01
    • parseBoolean

      public static boolean parseBoolean(String s, boolean defaultValue)
      Liberal parse method for boolean values. If the input string is a word that matches a boolean value, that boolean value will be returned. Otherwise the default value is returned. Comparison is case insensitive. Surrounding white space is NOT significant.

      true includes: true, t, yes, y, 1, ok

      false includes: false, f, no, n, 0, nope

      Parameters:
      s - String containing a boolean value to be parsed.
      defaultValue - returned when the input string does not have a boolean value
      Returns:
      true or false
      Since:
      ostermillerutils 1.07.01