Sunday, March 24, 2013

And the Winner Is...

Two weeks ago I posted a question/contest for Who's Birthday is it in "n" number of days? I received 63 answers (within the time limit) all with some interesting and unique approaches. Here's how I went about picking the winners:  

Leap Year

One thing that most people got wrong was what happened if the hire date was on Feb 29th (during a leap year) while the current year did not have a leap year. I could do a search online to find some recent leap years, instead I decided to wrote a query to find recent leap years.
-- Find past leap years
-- https://forums.oracle.com/forums/thread.jspa?threadID=1128019
select yr, 
  case 
    when mod(yr,400) = 0 or mod(yr,4) = 0then 'Leap Year'
    else null 
  end leap
from (
  select extract (year from add_months(sysdate, -8*12)) + level - 1 yr
  from dual
  connect by level <= 8
);
Test Data

Instead of testing against the main sample EMP table I recreated the EMP table with all the dates in a given leap year (2008). I kept the same structure as the EMP table so that it would be easy to test your solutions against.
drop table emp;

-- create mock emp table with all the dates (including leap year
create table emp as
select level empno, level ename, level job, level mgr, to_date('01-01-2008', 'DD-MM-YYYY') + level - 1 hiredate, level sal, level comm, level deptno
from dual
connect by level <= 366;
Picking the winner and testing solution:

Since we had 63 entries I wasn't going to test them all. Instead I decided to randomly pick solutions (based on the entry number) and validate their solution. If they were correct I'd mark them as a winner. If not, I'd move on to the next entry until I found 2 winners.

Here's how I picked the random entries:
-- Chose random winners
select trunc(dbms_random.value(1,63))
from dual;
To test the solutions I set my current SYSDATE to 1-Feb-2013 so that it would have to encounter people that were hired on 29-Feb (even though 29-Feb doesn't exist in 2013).
ALTER SYSTEM SET fixed_date='2013-02-01-14:10:00';
Winners

The winners are Erik van Roon (aka "Belly") and Iudith Mentzel. Please email me (my email address is top right corner of blog) and I will forward your information to Red Gate so that you can claim your prize.

Thanks for all the entries and thanks again to Red Gate for offering up the prizes (licenses to their new product: Source Control for Oracle). If you haven't already seen what Source Control for Oracle is or want to download a free trial I encourage you to check it out.

8 comments:

  1. Nice tricky contest, I did not get the leap year issue correct. As I wrote to you, there is an issue not only with employees hired 2008-02-29, but also with employees hired 2009-02-28.

    Many solutions (mine included) are using ADD_MONTHS (others MONTHS_BETWEEN.) Both functions are documented to have special behaviour on the last day of the month, which is good for hiredate 2008-02-29, but bad for hiredate 2009-02-28.

    Let's hire an employee every day of 2008 and 2009:

    create table emp as
    select empno
    , 'Man hired '||to_char(hiredate,'YYYY-MM-DD') ename
    , hiredate
    from (
    select level empno
    , date '2008-01-01' + level - 1 hiredate
    from dual
    connect by level <= date '2009-12-31' - date '2008-01-01' + 1
    )
    /

    I will demonstrate with my own (wrong) solution. Pretend sysdate is 2011-02-28. Select anniversaries that date or the two following (2011-03-01 and 2011-03-02):

    select ename
    , add_months(
    hiredate
    , 12 * ceil(months_between(date '2011-02-28', hiredate) / 12)
    ) next_anniversary
    from emp
    where add_months(
    hiredate
    , 12 * ceil(months_between(date '2011-02-28', hiredate) / 12)
    ) <= trunc(date '2011-02-28') + 2
    order by hiredate
    /

    ENAME NEXT_ANN
    -------------------- --------
    Man hired 2008-02-28 11-02-28
    Man hired 2008-02-29 11-02-28
    Man hired 2008-03-01 11-03-01
    Man hired 2008-03-02 11-03-02
    Man hired 2009-02-28 11-02-28
    Man hired 2009-03-01 11-03-01
    Man hired 2009-03-02 11-03-02

    "Man hired 2008-02-29" is included with a next_anniversary = 2011-02-28. Correct because ADD_MONTHS(2008-02-29,36)=2011-02-28.

    Now pretend sysdate is 2012-02-28. Select anniversaries that date or the two following (2012-02-29 and 2012-03-01):

    select ename
    , add_months(
    hiredate
    , 12 * ceil(months_between(date '2012-02-28', hiredate) / 12)
    ) next_anniversary
    from emp
    where add_months(
    hiredate
    , 12 * ceil(months_between(date '2012-02-28', hiredate) / 12)
    ) <= trunc(date '2012-02-28') + 2
    order by hiredate
    /

    ENAME NEXT_ANN
    -------------------- --------
    Man hired 2008-02-28 12-02-28
    Man hired 2008-02-29 12-02-29
    Man hired 2008-03-01 12-03-01
    Man hired 2009-02-28 12-02-29
    Man hired 2009-03-01 12-03-01

    Correct 5 employees. "Man hired 2008-02-29" this time has anniversary 2012-02-29, which is fine.

    But notice "Man hired 2009-02-28"... He has next_anniversary 2009-02-29, because ADD_MONTHS(2009-02-28,36)=2009-02-29 !!

    Very clearly error when I pretend sysdate is 2012-02-29:

    select ename
    , add_months(
    hiredate
    , 12 * ceil(months_between(date '2012-02-29', hiredate) / 12)
    ) next_anniversary
    from emp
    where add_months(
    hiredate
    , 12 * ceil(months_between(date '2012-02-29', hiredate) / 12)
    ) <= trunc(date '2012-02-29') + 2
    order by hiredate
    /

    ENAME NEXT_ANN
    -------------------- --------
    Man hired 2008-02-29 12-02-29
    Man hired 2008-03-01 12-03-01
    Man hired 2008-03-02 12-03-02
    Man hired 2009-02-28 12-02-29
    Man hired 2009-03-01 12-03-01
    Man hired 2009-03-02 12-03-02

    That is wrong result, "Man hired 2009-02-28" should not have been included here, as his anniversary actually was "yesterday"...

    Solutions like my own based on ADD_MONTHS or MONTHS_BETWEEN will have solved leap year issue only half-way. They will handle that employees hired 02-29 should be celebrated 02-28 in non-leap years.

    But they will fail to celebrate employees hired 02-28 in non-leap years on 02-28 always, even in leap years. Because ADD_MONTHS and MONTHS_BETWEEN handle "last day of month" specially, they will wrongly be celebrated on 02-29 in leap years.

    ReplyDelete
    Replies
    1. Hello Kim,

      My name is Hamid Talebian and my email is H.Talebian@pz.agro.nl

      You can see my solution on the original blog page (I misread 15 march as 25 march and I was too late)

      There are two scenario's for an emlployee with hiredate 29-feb:

      1- Anniversary is 28-feb or 29-feb (anniversary in every years)
      2- Anniversary is allways 29-feb (anniversary only in leap years!!)

      For scenario 1:

      with
      hr_emp as
      (select empno, ename, job, mgr, hiredate, sal, comm, deptno
      , case when to_char(sysdate, 'MMDD') <= to_char(hiredate, 'MMDD') then 0 else 1 end y_ind
      ,to_char(sysdate, 'RRRR') - to_char(hiredate, 'RRRR') years
      -- ,to_char(hiredate, 'MMDD') MM_DD
      from emp)
      , emp_anniv as
      (select empno, ename, job, mgr, hiredate, sal, comm, deptno, y_ind, years
      , case to_char(hiredate, 'MMDD')
      when '0228' then to_date(to_char((to_char(hiredate, 'RRRR') + years + y_ind))||'0228', 'rrrrmmdd')
      else add_months(hiredate, 12 * (y_ind + years))
      end anniversary
      from hr_emp
      )
      select * from emp_anniv
      where anniversary between trunc(sysdate) and trunc(sysdate) +30

      For scenario 2:

      with
      hr_emp as
      (select empno, ename, job, mgr, hiredate, sal, comm, deptno
      , case when to_char(sysdate, 'MMDD') <= to_char(hiredate, 'MMDD') then 0 else 1 end y_ind
      ,to_char(sysdate, 'RRRR') - to_char(hiredate, 'RRRR') years
      -- ,to_char(hiredate, 'MMDD') MM_DD
      from emp)
      , emp_anniv as
      (select empno, ename, job, mgr, hiredate, sal, comm, deptno, y_ind, years
      , case to_char(hiredate, 'MMDD')
      when '0228' then to_date(to_char((to_char(hiredate, 'RRRR') + years + y_ind))||'0228', 'rrrrmmdd')
      else add_months(hiredate, 12 * (y_ind + years))
      end anniversary
      , add_months(hiredate, 12 * (4 * ceil( (y_ind + years) / 4))) anniversary_spec
      from hr_emp
      )
      select * from emp_anniv
      where
      (to_char(hiredate, 'MMDD') = '0229' and anniversary_spec between trunc(sysdate) and trunc(sysdate) +30)
      or
      (to_char(hiredate, 'MMDD') != '0229' and anniversary between trunc(sysdate) and trunc(sysdate) +30)

      In both solutions if hiredate is on 28-02, the anniversary is allways 28-02

      Warm regards,

      Hamid Talebian

      Delete
    2. Hi Kim,

      you are the master and i usally have no right to add something to your valuable remarks, but i cant follow your last conclusion. Cont reproduce the error with my quick hack below:

      with data as (
      select date '2012-02-28' d from dual
      )

      select
      ename
      ,hiredate
      ,add_months(hiredate,
      months_between(trunc(d,'YYYY'),trunc(hiredate,'YYYY'))
      ) birthday
      from test_emp, data
      where
      add_months(hiredate,
      months_between(trunc(d,'YYYY'),trunc(hiredate,'YYYY'))
      )
      between
      d
      and
      d + 30

      Delete
  2. Kim Berg Hansen pointed out to me that my query was not as robust as should be.
    Thanks for that Kim.

    He showed me that it had 2 flaws:
    1. a person hired on the February 28 in a non-leap year would have anniversaries on February 29 in leap years, because of the way add_months works with last-day-of-the-month dates.
    2. It did not list people having a anniversary just after new year if sysdate is just before new year (and the anniversary is in fact within 30 days).

    I addressed both issues, and then found out that I liked the solution Kim build for issue 2 a lot better then my own.
    So, with his permission, I incorporated that in my query (ceil(add_months())).

    Here it is, the updated version:


    DEFINE no_of_days=30

    WITH raw_dates
    AS (SELECT ename
    , hiredate
    , ADD_MONTHS(hiredate
    ,12 * GREATEST (CEIL (MONTHS_BETWEEN (SYSDATE,hiredate)/12),1)
    ) raw_anniversary_date
    FROM emp
    )
    , dates
    AS (SELECT ename
    , hiredate
    , CASE
    WHEN EXTRACT (DAY FROM raw_anniversary_date) > EXTRACT (DAY FROM hiredate)
    THEN raw_anniversary_date - 1
    ELSE raw_anniversary_date
    END anniversary_date
    FROM raw_dates
    )
    SELECT ename
    , hiredate
    , anniversary_date
    , anniversary_date - TRUNC(SYSDATE) "days from now"
    FROM dates
    WHERE (anniversary_date - TRUNC(SYSDATE)) BETWEEN 0 AND &no_of_days
    ;


    explanation:
    Query 'raw_dates' calculates the next anniversary date on or after sysdate
    It does so by first calculating the number of years between hiredate and now (months_between/12), rounding this up (ceil) and then adding his amount of years times 12 months to hiredate.
    If the number of years appears to be 0, hiredate equals today, so next anniversary will be next year. Hence greatest(... , 1)

    Note that at this point a hiredate of 28th of February in a non-leap year is the last day of the month, and so for a sysdate in a leap year will return the 29th of February as raw_anniversary_date.

    To overcome this, query 'dates' takes the result-set of 'raw_dates', and if the day-of-month of raw_anniversary_date is higher then the one from hiredate, raw_anniversary_date is decreased by 1.

    Finally the main query takes the 'dates' result-set and only returns rows for which the anniversary date is between now and 30 days

    ReplyDelete
  3. http://en.wikipedia.org/wiki/Leap_year

    Leap years do not fall on centuries which are not perfectly divisible by 400. So, year 2000 is a leap year but year 1900 is not. The above sql for leap year should be changed to

    select yr,
    case
    when mod(yr,400) = 0 or (mod(yr,4) = 0 and mod(yr,100)!= 0) then 'Leap Year'
    else null
    end leap
    from (
    select extract (year from add_months(sysdate+35500, -12*12)) + level - 1 yr
    from dual
    connect by level <= 12
    );

    Added 35500 days to go to the next century...

    ReplyDelete
  4. Ajaykumar,

    What do you see wrong in Bellys and my solutions that require calculating leap year manually? There is nowhere in our code that we assume leap year happens every 4 years?

    Oracle SQL automatically handles leap years, it knows internally about the rule you are quoting. It is precisely because the functions ADD_MONTHS and MONTHS_BETWEEN are leap-year-aware functions, that we need to do a little extra work for these anniversaries.

    I would not recommend writing my own leap year calculation anyway using MOD function. The definition of leap year might be subject to change if international standards committee for example decides year 2200 is a leap year in order to fix if the earth is slowing down (unlikely, but you get the point :-) It is far better to rely on database built-in leap year calculation - then any change in definition will be Oracles problem and will be fixed in a patchset or new release. Don't do yourself what you can get Oracle to do ;-)

    ReplyDelete
  5. Hello All,

    After all these interesting discussions, it looks to me that we should
    revert to "manually" dealing with the case of Feb 28/29 ...

    At first glance, it looked to me that the point at which MONTH_BETWEEN
    and ADD_MONTHS are different will do the trick, but these functions
    are both "too clever" in handling ends of months.

    So, if I go back to my first solution, which does NOT use any of the two functions, it goes like this:

    SELECT e.empno,
    e.ename,
    e.hiredate,
    d.myday,
    EXTRACT(YEAR FROM (d.myday - e.hiredate) YEAR(3) TO MONTH) AS hire_age
    FROM
    emp e,
    ( SELECT TRUNC(SYSDATE) + ROWNUM - 1 AS myday
    FROM dual
    CONNECT BY LEVEL <= &N ) d
    WHERE
    ( TO_CHAR(d.myday,'MMDD') = TO_CHAR(e.hiredate,'MMDD')
    OR
    TO_CHAR(d.myday,'MMDD') = '0228' AND TO_CHAR(e.hiredate,'MMDD') = '0229' AND
    TO_CHAR(LAST_DAY(d.myday),'DD') = '28'
    )
    /


    Many times the first idea is the closest one to THE truth.

    Thanks a lot to everybody, an instructive lesson indeed :)

    Best Regards,
    Iudith

    ReplyDelete
  6. In "when mod(yr,400) = 0 or mod(yr,4) = 0 then 'Leap Year'" mod(yr,400) = 0 is unnecessary, because every number divisable by 400 is divisable by 4. But the logic here is wrong - as Ajaykumar said, year is not leap if it is divisable by 100, except those which are divisable by 400.

    ReplyDelete