A customer had an interesting problem recently involving selective bit-flipping. The real scenario was a bit more complicated, but it came down to the following scenario: the client has a large number of members. Each member visits multiple times. They want to switch the value between X2 and NULL every time a ‘true’ event occurs.

So a single member might have an event record looking like this:

ID   EVENT_ID EVENT_VALUE
---------- ---------- --------------------
         1          1 false
         1          2 false
         1          3 false
         1          4 true
         1          5 false
         1          6 true
         1          7 true

The output should maintain the event record with an event_value of NULL to start. Every time the event_value is true, the result should switch between ‘X2′ and NULL. So the output should be:

ID   EVENT_ID EVENT_VALUE
---------- ---------- --------------------
         1          1 
         1          2 
         1          3 
         1          4 X2
         1          5 X2
         1          6 
         1          7 X2

The customer was producing this result using PL/SQL in a FOR loop. The optimization was a little complicated, but in the end, we got the same results in a single SQL statement. So let’s run a quick scenario with 1,000,000 members each having somewhere between 1 and 20 events:

SQL> CREATE TABLE bit_flip
  2    (id            NUMBER,
  3     event_id      NUMBER,
  4     event_value   VARCHAR2(20));

Table created.

SQL> CREATE TABLE bit_result_loop
  2    (id            NUMBER,
  3     event_id      NUMBER,
  4     event_value   VARCHAR2(20));

Table created.

SQL> CREATE TABLE bit_result_sql
  2    (id            NUMBER,
  3     event_id      NUMBER,
  4     event_value   VARCHAR2(20));

Table created.

SQL> BEGIN
  2    FOR i IN 1 .. 1000000 LOOP
  3  
  4       INSERT INTO bit_flip (id, event_id, event_value)
  5       SELECT i, rownum rn,
  6        (CASE ROUND(dbms_random.value()) 
  7                WHEN 0 THEN 'false' ELSE 'true' END)
  8         FROM all_objects
  9        WHERE rownum < dbms_random.value(low => 1, high => 20);
 10    END LOOP;
 11  END;
 12  /

PL/SQL procedure successfully completed.

SQL> COMMIT;

Commit complete.

SQL> SELECT COUNT(*) FROM bit_flip;

  COUNT(*)
----------
10004129

The standard way of going through this is to use a PL/SQL block with a FOR loop like this:

SQL> SET TIMING ON
SQL> DECLARE
  2    user_id     NUMBER := 0;
  3    bit_flip    BOOLEAN;
  4  BEGIN
  5    FOR i IN (SELECT id, event_id, event_value FROM bit_flip) LOOP
  6  
  7       -- If this check fails, then we are onto a new user
  8       IF i.id <> user_id THEN
  9          user_id := i.id;
 10          bit_flip := FALSE;
 11       END IF;
 12  
 13       -- If the event_value is true, then flip the bit
 14       IF i.event_value = 'true' THEN
 15  
 16          IF bit_flip = FALSE THEN
 17       bit_flip := TRUE;
 18          ELSE
 19       bit_flip := FALSE;
 20          END IF;
 21       END IF;
 22  
 23       IF bit_flip = TRUE THEN
 24          INSERT INTO bit_result_loop
 25       (id, event_id, event_value)
 26          VALUES (user_id, i.event_id, 'X2');
 27       ELSE
 28          INSERT INTO bit_result_loop
 29       (id, event_id, event_value)
 30          VALUES (user_id, i.event_id, NULL);
 31       END IF;
 32    END LOOP;
 33  END;
 34  /

PL/SQL procedure successfully completed.

Elapsed: 00:21:09.17
SQL> COMMIT;

Commit complete.

Elapsed: 00:00:00.01

In this simplified example, looping through 10M rows for 1M members took 21 minutes. You can imagine how long it would take in a more complex scenario with additional tables, joins and PL/SQL manipulations.

However, if you can get the same results in a single SQL statement, you can get considerable time savings. Sometimes you need to get creative in the SQL using Oracle analytics functions. If you have never used Oracle analytics functions like windowing, I highly suggest taking a look at some examples, which can be found here.

For this example, we converted the ‘true’ events into a numeric 1 and disregarded the rest. Then we used an analytic SUM to keep a running count over the individual member. Finally, we applied a MOD over 2 to that sum, which provided the flip between even and odd occurrences. The end result:

SQL> SET TIMING ON
SQL> INSERT INTO bit_result_sql
  2  SELECT id, event_id,
  3        (CASE MOD(start_count, 2) WHEN 0 THEN NULL ELSE 'X2' END)
  4    FROM (
  5         SELECT id, event_id,
  6          SUM(CASE WHEN event_value = 'true' THEN 1 ELSE 0 END)
  7             OVER (PARTITION BY id ORDER BY event_id ) start_count
  8          FROM bit_flip);

10004129 rows created.

Elapsed: 00:01:46.54
SQL> COMMIT;

Commit complete.

Elapsed: 00:00:00.02

As you can see, we got a 10x increase in performance by switching from PL/SQL to a SQL statement. That was due to a combination of removing the PL/SQL-to-SQL context switching and allowing Oracle to complete the DML in one take. Note that the SQL statement can be further optimized by running in parallel as well, which could lead to another 10x increase!

Link to script that contains the examples in this post.

 

 

 

 

 

 

 

Comment