Simple tokenizing with Oracle PL/SQL

July 18, 2007 by Michael

I was in need of tokenizing some comma delimited data within an Oracle Database. A pity, there’s no split for a varchar2 like java.lang.String.split. I could have used Java in the database, but that would be lame, too.j

I found this little function which uses pipelined results, that is, it returns his results while being processed and furthermore, it can be used as a table in a from clause. Great as i need no extra code to check my values in my case.

So here it is:

CREATE OR REPLACE TYPE split_tbl AS TABLE OF VARCHAR2(32767);
/
show errors;
 
CREATE OR REPLACE FUNCTION f_split (
    p_list VARCHAR2,
    p_del VARCHAR2 := ','
) RETURN split_tbl pipelined
IS
    l_idx    PLS_INTEGER;
    l_list    VARCHAR2(32767) := p_list;
    l_value VARCHAR2(32767);
BEGIN
    LOOP
        l_idx := INSTR(l_list,p_del);
        IF l_idx > 0 THEN
            pipe ROW(LTRIM(RTRIM(SUBSTR(l_list,1,l_idx-1))));
            l_list := SUBSTR(l_list,l_idx+LENGTH(p_del));
        ELSE
            pipe ROW(LTRIM(RTRIM(l_list)));
            EXIT;
        END IF;
    END LOOP;
    RETURN;
END f_split;
/
show errors;

This thing takes two parameters, the first one the string to be tokenized, the second an optional delimiter. It returns all tokens trimmed as a row for each token.

Kudos to Scott Stephens.

4 comments

  1. Richard Filion wrote:

    Here is an alternate to the above. I also found this on the internet somewhere.

    The difference is the f_split will do everything for you (it will parse the whole string) whereas this one will only fecth one at a time. Might be useful for some 🙂

    This is the procedure:

    create or replace procedure tokenizer(iStart   IN NUMBER,
                                          sPattern in VARCHAR2,
                                          sBuffer  in VARCHAR2,
                                          sResult  OUT VARCHAR2,
                                          iNextPos OUT NUMBER) AS
      nPos1 number;
      nPos2 number;
    BEGIN
      nPos1 := Instr(sBuffer, sPattern, iStart);
      IF nPos1 = 0 then
        sResult := NULL;
      ELSE
        nPos2 := Instr(sBuffer, sPattern, nPos1 + 1);
        IF nPos2 = 0 then
          sResult  := Rtrim(Ltrim(Substr(sBuffer, nPos1 + 1)));
          iNextPos := nPos2;
        else
          sResult  := Substr(sBuffer, nPos1 + 1, nPos2 - nPos1 - 1);
          iNextPos := nPos2;
        END IF;
      END IF;
    END tokenizer;
    
    --This is to test it.
    create or replace procedure sp_test_tokenizer as
      sepr   varchar2(1);
      sbuf   varchar2(200);
      sres   varchar2(200);
      pos    number;
      istart number;
    begin
    
      sbuf   := '@0@11@222@3333@44444@555555@6666666@77777777@888888888';
      sepr   := '@';
      istart := 1;
      tokenizer(istart, sepr, sbuf, sres, pos);
      if (pos  0) then
        dbms_output.put_line(sres);
      end if;
      while (pos  0) loop
        istart := pos;
        tokenizer(istart, sepr, sbuf, sres, pos);
        dbms_output.put_line(sres);
      end loop;
    END sp_test_tokenizer;
    
    Posted on March 25, 2009 at 2:33 PM | Permalink
  2. Michael wrote:

    Hey Richard,

    thanks for your input!

    I need my f_split versions for selects like:

    SELECT * FROM TABLE(f_split('eins@@zwei@@drei'));

    That is i can use it as join table in arbitrary selects.

    Have a great day,
    Michael.

    Posted on March 25, 2009 at 2:50 PM | Permalink
  3. DJ wrote:

    Can you give some example to use with a simple select statement?

    Posted on January 22, 2011 at 1:04 AM | Permalink
  4. Michael wrote:

    DJ: Se my comment above…

    Posted on January 24, 2011 at 9:25 AM | Permalink
Post a Comment

Your email is never published. We need your name and email address only for verifying a legitimate comment. For more information, a copy of your saved data or a request to delete any data under this address, please send a short notice to michael@simons.ac from the address you used to comment on this entry.
By entering and submitting a comment, wether with or without name or email address, you'll agree that all data you have entered including your IP address will be checked and stored for a limited time by Automattic Inc., 60 29th Street #343, San Francisco, CA 94110-4929, USA. only for the purpose of avoiding spam. You can deny further storage of your data by sending an email to support@wordpress.com, with subject “Deletion of Data stored by Akismet”.
Required fields are marked *