Browsing "Monthly Contest"

The Winner is Jichun Wang

Dec 4, 2011 by     1 Comment     Posted under: Code, Monthly Contest, New Technologies, Tips & Techniques

The winner of our last programming challenge is Jichun Wang. Jichun is a Sun Microsystems alumnus, now a senior software engineer at Synopsys. He used Perl to call a Google RESTful API and parse the JSON it returns to get the result count of a Google search. You can find his program here on my SAS_ACADEMY group.

Because the API he used is deprecated, the counts it returns differ significantly from those you get by googling directly in a browser; however, he got the order correct!

American hero   Count by API    Count by browser
Batman          30,600,000      332 M
Iron Man        18,700,000      266 M
Superman        13,900,000      184 M
Spiderman       13,000,000      122 M
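
As a quick check that the two columns really give the same order, they can be ranked independently. This is a small sketch in Python rather than Jichun's Perl; the counts are copied from the table above, and reading "M" as millions is my assumption.

```python
# Counts copied from the table: (count by API, count by browser).
# "M" in the browser column is assumed to mean millions.
counts = {
    "Batman": (30_600_000, 332_000_000),
    "Iron Man": (18_700_000, 266_000_000),
    "Superman": (13_900_000, 184_000_000),
    "Spiderman": (13_000_000, 122_000_000),
}

# Rank the heroes by each column, most popular first.
by_api = sorted(counts, key=lambda hero: counts[hero][0], reverse=True)
by_browser = sorted(counts, key=lambda hero: counts[hero][1], reverse=True)

print(by_api)
print(by_api == by_browser)  # True: both sources agree on the order
```

The absolute numbers differ by roughly a factor of ten, but only the order matters for the ranking.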

October Programming Challenge: Ranking American Heroes

Oct 5, 2011 by     2 Comments    Posted under: Code, Monthly Contest

At the beginning of each month, I follow the update of the TIOBE Index very closely. The TIOBE Index measures the popularity of programming languages based on search engine results. By the way, SAS ranked 24th there last month. The way the TIOBE Index is defined inspired me to come up with the following game for this month:

Use any programming language you feel comfortable with to rank the popularity of the following American heroes (each picture contains a link pointing to its Wikipedia entry):

Superman Batman Spiderman Ironman

Send me your source code by 31/10 to win a $25 gift card.

As for our previous challenge, Motto of Hogwarts School, the winner is Jiangtang Hu. This is the second time Jiangtang has won our Programming Challenge. Congratulations! :) His method is to tap into an online Latin-English dictionary. Other contenders approached the problem by using Google Translate or by digging into search results. As I said last time, Google Translate does not give a correct answer. For example, the following Perl one-liner
use LWP::UserAgent;$ua=LWP::UserAgent->new;$ua->agent('Mozilla/6.0');
print $ua->request(HTTP::Request->new('GET','http://translate.google.com/translate_a/t?client=xxxxxxx&text=draco%20dormiens%20nunquam%20titilandus&sl=la&tl=en'))->content;

returns “the dragon never sleeping titilandus” in JSON format. To me, the most Q&D solution is the following Perl one-liner:
use LWP::UserAgent;$ua=LWP::UserAgent->new;$ua->agent('Mozilla/6.0');
foreach(split /\n/,($ua->request(HTTP::Request->new('GET','http://www.google.com/search?&q=draco+dormiens+nunquam+titillandus')))->content){
if(/mean/i){s/<[^>]*>//g,s/^.*means\s+&quot\;([^&]*)&quot\;.*$/$1/;print;}}
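
The idea behind this one-liner is simple: keep the lines that mention "mean", strip the HTML tags, and pull out the text between the &quot; entities. Here is a rough Python sketch of just that extraction step. It does not fetch anything from Google; the sample line is made up for illustration, and the function name extract_meaning is my own.

```python
import re
from typing import Optional


def extract_meaning(page_text: str) -> Optional[str]:
    """Return the quoted text that follows 'means', or None if absent."""
    for line in page_text.splitlines():
        if re.search(r"mean", line, re.IGNORECASE):
            # Drop anything that looks like an HTML tag.
            plain = re.sub(r"<[^>]*>", "", line)
            # Capture what sits between the two &quot; entities.
            match = re.search(r"means\s+&quot;([^&]*)&quot;", plain)
            if match:
                return match.group(1)
    return None


# Hypothetical search-result line, NOT real Google output.
sample = ("<p>The motto <b>means</b> "
          "&quot;never tickle a sleeping dragon&quot; in English.</p>")
print(extract_meaning(sample))
```

Like the Perl version, this depends entirely on how the page happens to be worded, which is exactly why it is Q&D.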

Programming Challenge (6): Motto of Hogwarts School

Aug 25, 2011 by     3 Comments    Posted under: Code, Monthly Contest

I am no big Harry Potter fan, so I can’t answer questions like when Ron fell in love with Hermione. But that doesn’t stop me from playing this game:

Find the motto of Hogwarts School of Witchcraft and Wizardry, “draco dormiens nunquam titillandus”, in English by using a program written in whatever programming language you feel comfortable with.

A few approaches may come to mind immediately:

  1. Q(uick) approach: Google translate API
  2. Q(uick)&D(irty) approach: web text/knowledge mining
  3. Diehard approach: write something equivalent to Google translate
  4. et cetera

The first approach, unfortunately, is actually a failure case for Google Translate (titillandus is the gerundive of titillo, titillare). For the second, I like Q&D, but it may not be scalable, i.e., you can’t apply the same method to translate, say, the book of Genesis in the Vulgate into English. As for the third one, well, I don’t think so :) .

Send the source code to me by 30/9 to win a $25 gift card from Fry’s. Well, it’s actually negotiable if you prefer Macy’s.

BTW, I had considered another topic:

Consider the subset of SAS programs that do not use macros. Write a program, in any language you feel comfortable with, that strips out the comments.

Some contenders in our June challenge, like Kalyani, actually touched on this topic. Then I chatted with Jiangtang, and both of us agreed it may be of less interest to the SAS community. So I ended up leveraging the Deathly Hallows instead. But if you want to try this code scanner topic, you are more than welcome!
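
For anyone who wants to try the code scanner topic, here is a minimal sketch in Python, not a submitted solution. It assumes only two comment forms: the block comment /* ... */, and the statement comment that starts with * at the beginning of a statement and ends at the next semicolon. Quoted strings are left alone. The function name strip_sas_comments is made up for illustration, and special cases such as DATALINES blocks are ignored.

```python
def strip_sas_comments(src: str) -> str:
    """Remove /* ... */ and '* ... ;' comments from macro-free SAS code."""
    out = []
    i, n = 0, len(src)
    at_statement_start = True  # True until a statement's first token is seen
    while i < n:
        ch = src[i]
        if src.startswith("/*", i):
            # Block comment: skip to the closing */ (or to end of input).
            end = src.find("*/", i + 2)
            i = n if end == -1 else end + 2
            continue
        if ch in "'\"":
            # Quoted string: copy verbatim; a doubled quote is an escape.
            j = i + 1
            while j < n:
                if src[j] == ch:
                    if j + 1 < n and src[j + 1] == ch:
                        j += 2
                        continue
                    break
                j += 1
            out.append(src[i:j + 1])
            i = j + 1
            at_statement_start = False
            continue
        if ch == "*" and at_statement_start:
            # Statement comment: skip through the terminating semicolon.
            end = src.find(";", i)
            i = n if end == -1 else end + 1
            continue
        out.append(ch)
        if ch == ";":
            at_statement_start = True
        elif not ch.isspace():
            at_statement_start = False
        i += 1
    return "".join(out)


code = "data a; /* c */ x=2*3; * note; y='/* keep */'; run;"
print(strip_sas_comments(code))
```

The multiplication in x=2*3 survives because the asterisk is not at the start of a statement, and the text inside the quoted string is untouched.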

A summary of our previous challenges and the winners:

March challenge: Web crawling (L0) (winner: Megha)
April challenge: Web crawling (L1) (winner: Megha)
May challenge: Eight Queens Puzzle (winner: Kalyani)
June challenge: Source line counting (winner: Megha)
July challenge: Infinitesimal in sas (winner: Jiangtang)

The Winner Of July Programming Challenge Is Jiangtang Hu

Aug 15, 2011 by     No Comments    Posted under: Code, Monthly Contest, New Technologies, Tips & Techniques

Our last programming challenge was Infinitesimal in the SAS universe of discourse. The following code, and code like it, provides a predicate for approaching that “infinitesimal”:


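data _null_; x=1;
/* keep halving x until one more halving would underflow to zero; */
/* c counts the halvings, so the final x equals 2**(-c) */
do until (0=x/2);x=x/2;c+1;end;
put c= x= binary64.;
run;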

The put statement writes the following output to the log:

c=1074 x=0000000000000000000000000000000000000000000000000000000000000001

In other words, the SAS infinitesimal is 2^{-1074}.
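
The same experiment can be repeated outside SAS. The sketch below is in Python and assumes, as is the case on mainstream platforms, that a Python float is an IEEE 754 double just like a SAS numeric; the function name is my own.

```python
import math


def halve_until_underflow():
    """Halve 1.0 until one more halving would give exactly zero."""
    x, count = 1.0, 0
    while x / 2 != 0.0:
        x /= 2
        count += 1
    return count, x


count, smallest = halve_until_underflow()
print(count)                                # 1074
print(smallest == math.ldexp(1.0, -1074))   # True: smallest is 2**-1074
```

The loop stops at 2^{-1074} because half of that value lies exactly between zero and the smallest representable number, and it is rounded to zero.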

Some programmers provided a solution using the SAS constant “SMALL”. Perhaps surprisingly, it is not the smallest positive number in the SAS universe:

data _null_;x=constant('SMALL'); y=x/2; z=x>y; put x= binary64. y= binary64. z=;run;

Log:

x=0000000000010000000000000000000000000000000000000000000000000000
y=0000000000001000000000000000000000000000000000000000000000000000 z=1

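The rationale behind the definition of “SMALL”, from my understanding, is that it is the smallest number to which the implicit-bit encoding of floating point numbers (see IEEE 754) still applies, i.e., the smallest normalized number. Anything smaller is stored as a denormalized number, without the implicit leading bit.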

More code to play with:

data _null_; x=log2(constant('SMALL')); put x=;run;

Log:

x=-1022
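
The logged values suggest that “SMALL” is 2^{-1022}, the smallest normalized double. Python exposes the same number as sys.float_info.min, so the two logs above can be reproduced there. This is a sketch that assumes IEEE 754 doubles on both sides; the helper binary64 is my own name, chosen to mimic the SAS format.

```python
import math
import struct
import sys


def binary64(x: float) -> str:
    """Return the 64 bits of a double as a string, like SAS binary64."""
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    return format(bits, "064b")


small = sys.float_info.min   # smallest normalized double, 2**-1022
half = small / 2             # still positive: a denormalized number

print(binary64(small))       # exponent field is 00000000001
print(binary64(half))        # exponent field is all zeros
print(small > half > 0)      # True
print(math.log2(small))
```

The first two printed bit patterns match the x= and y= lines of the SAS log: halving “SMALL” clears the exponent field and moves the single set bit into the fraction.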

And the winner is….

Jul 15, 2011 by     3 Comments    Posted under: Case studies, CDISC, Monthly Contest, SAS Library, Tips & Techniques

CDISC Express Mapping contest comes to an end today!

The challenge was to create a mapping file to map the source data set provided on this page to the SDTM DM domain using CDISC Express. Learn more about CDISC Express.

The winner is Jiangtang Hu! He is a reader and a blogger who lives in Beijing, China. He is a statistical SAS programmer at Sanofi Pasteur and, in life, a new member of the “Elite Fathers’ Club”.  He wins an iPad2!

Jiangtang is one of the early testers and adopters of CDISC Express.  He wrote a paper, ‘Dive into CDISC Express’, to help users like him. We posted the first part of this paper on our blog last week. There are 4 parts, each guiding the user through different features of the application. The next part will be published next week.

Thank you to all the participants, and congratulations to Jiangtang!

Programming Challenge: Infinitesimal in the SAS universe of discourse

Jul 8, 2011 by     1 Comment     Posted under: Code, Monthly Contest, New Technologies, Tips & Techniques

First of all, thanks to Jonathan Lee, Jiangtang Hu, Jacques Thibault, Neha Mohan and Na Li for their feedback on the question about the largest 3-byte integer in a SAS dataset on our Code Jeet Kune Do email list. Code Jeet Kune Do grew out of our internal Q&A mailing list for sharing techniques and knowledge. If you are interested in subscribing, please shoot me a message at jian.dai@clinovo.com and I will add your email there.

Second, the term “infinitesimal” in the title is a bit misleading. Here is how the game is specified: Please use a program to find the minimal positive number that SAS can represent and send your code to me by August 1, 2011. The one who gets it right first wins :)

Hint: The following code is a brute-force approach to pinning down the maximal integer that SAS can represent.

data _null_;do until(x=x+1);x+1;end;put x= binary64.;run;

Don’t run it: I estimate it would take about 456 days to exit the loop on my laptop (unless you have access to a super super fast machine) :)
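
The exit value of that loop can be found without waiting. Here is a sketch in Python, again assuming IEEE 754 doubles. Instead of stepping by 1 it doubles x; this gives the same answer because every integer below 2^53 is exactly representable, so the first x with x = x + 1 has to be 2^53 itself. The function name and the rate calculation are mine.

```python
def first_integer_equal_to_successor() -> float:
    """Find the first x with x + 1 == x, testing powers of two only."""
    x = 1.0
    while x + 1.0 != x:
        x *= 2.0
    return x


limit = first_integer_equal_to_successor()
print(limit == 2.0 ** 53)  # True

# The SAS loop needs about `limit` iterations. A 456-day run time
# would correspond to this many iterations per second:
seconds = 456 * 24 * 60 * 60
print(limit / seconds)     # on the order of 2e8
```
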

Reference:
Jacques and I discovered an excellent paper on this subject that you can use: Numeric Length: Concepts and Consequences. This topic is along the lines of the post Play Dataset Like a CS Pro: Binary Tree, as it touches on some fundamental issues in the field of computation.

Megha becomes a three-time winner: the June Programming Challenge is now finished

Jul 8, 2011 by     2 Comments    Posted under: Code, Monthly Contest, New Technologies, Scripting, Tips & Techniques

The top three contenders are Kalyani Chilukuri from Clinovo, Jiangtang Hu from Sanofi Pasteur, and Megha Agarwal from Clinovo. Thank you all for your excellent work! The winner is Megha: her code came closest to beating the benchmark when tested on a version of CDISC Express. Congratulations!

Benchmark code

%let _=%sysfunc(time());
filename _ temp;
data _null_;
infile 'dir/s/b "C:\CDISC Express\*.sas"|findstr/v "sas7 sas~"' pipe end=EOF;input;
if _n_=1 then call execute('proc printto log=_ new;');
call execute('data _null_; infile "'||_infile_||'"; input;');
if EOF then call execute('proc printto log=log;');
data _null_;retain s;
x=prxparse('/(\d+) records were read from the infile/');
infile _ end=EOF;input;
if prxmatch(x,_infile_) then s+input(prxposn(x,1,_infile_),best.);
if EOF then put s=;
run;
%put %sysevalf(%sysfunc(time())-&_);

Nuance

The next two solutions:
i) A terser and faster SAS approach that uses a more sophisticated shell command:

data _null_; infile 'for /f "tokens=* usebackq" %f in (`dir/s/b "C:\CDISC Express\*.sas"^|findstr /i /v "sas7 sas~"`) do @type "%f"' pipe;input;run;

The total number of lines can be found in the log file.
ii) If you have Cygwin or the MinGW minimal system installed on your Windows machine, then you can pipe the Unix command “wc -l” into SAS to get the line count of each program.

However, both approaches run into a systematic error rooted in the fact that CDISC Express was originally developed on Solaris, a Unix platform. For example, apply the Unix command “cat -v” to the macro “attrn.sas” (the “v” switch makes this utility display nonprinting characters) and you will get something like this:

/*******************************************************************************^M
* PROGRAM NAME: attrn.sas^M
* DESCRIPTION: ^M
* - Open a dataset and get one of its attributes^M
*^M
* PROGRAMMER: Ale Gicqueau^M
*******************************************************************************/^M
^M
%macro attrn(ds,attrib);^M
^M
%local dsid rc;^M
^M
%let dsid=%sysfunc(open(&ds,is));^M
%if &dsid EQ 0 %then %do;^M
%put ERROR: (attrn) Dataset &ds not opened due to the following reason:;^M
%put %sysfunc(sysmsg());^M
%end;^M
%else %do;^M
%sysfunc(attrn(&dsid,&attrib))^M
%let rc=%sysfunc(close(&dsid));^M
%end;^M
^M
%mend;

“^M” is the carriage return, which is missing from the last line. Neither of the two approaches above counts the last line in this and similar cases.
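
The off-by-one is easy to reproduce. “wc -l” counts newline characters, so a last line without a terminator goes uncounted. The Python sketch below contrasts the two ways of counting on a three-line sample modelled on attrn.sas; that the real file ends with no terminator at all is my assumption from the listing above.

```python
def count_newlines(data: bytes) -> int:
    """Count newline characters, which is what 'wc -l' reports."""
    return data.count(b"\n")


def count_lines(data: bytes) -> int:
    """Count lines, including a final line with no terminator."""
    return len(data.splitlines())


# Two CRLF-terminated lines followed by an unterminated last line.
sample = b"%macro attrn(ds,attrib);\r\n%local dsid rc;\r\n%mend;"
print(count_newlines(sample), count_lines(sample))  # 2 3
```
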

PowerShell

We discussed the issue of “zero installation programming” before (see here and here). If you are using Windows 7, then the following one-liner script is ready to go:

$c=0;foreach ($x in ls -r | where {$_.extension -eq ".sas"} ){
$c=$c+$(get-content $x.FullName|measure).Count
}; $c
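
For comparison, the same tree walk in Python might look like the sketch below. It is not one of the submitted solutions; the function name count_sas_lines is made up, and matching the extension case-insensitively is my choice, made to mirror the PowerShell -eq comparison.

```python
from pathlib import Path


def count_sas_lines(root: str) -> int:
    """Sum the line counts of every .sas file under root, recursively."""
    total = 0
    for path in Path(root).rglob("*"):
        # Exact suffix match, so files like .sas7bdat are skipped.
        if path.is_file() and path.suffix.lower() == ".sas":
            # splitlines() also counts a last line with no terminator.
            total += len(path.read_bytes().splitlines())
    return total
```

Calling count_sas_lines on the CDISC Express folder would give the total directly, without going through the SAS log.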

CDISC Express Contest – Latest News

Jun 15, 2011 by     No Comments    Posted under: CDISC, Code, Monthly Contest

1. Given the requests we have received, the deadline for the CDISC Express Mapping Contest has been extended to July 15th.

You still have a chance to participate and win an iPad2! Your challenge is to create a mapping file to map the source data set provided on this page to the SDTM DM domain using CDISC Express.

Here are the rules: http://www.clinovo.com/cdisc/game

2. Jian’s challenge is still running. You have until July 1st to count the lines of all SAS source code of CDISC Express. Jian is waiting for your answers!
Send us your mapping file and get a chance to win a Fry’s gift card.

Programming Challenge: Source line counting

Jun 5, 2011 by     1 Comment     Posted under: Monthly Contest, New Technologies

The winner of our third programming challenge is Kalyani Chilukuri. She got all 92 solutions to the Eight Queens Puzzle by using SAS. Congratulations!

This time I want to do more on tree traversal. The question has actually already been sent to a SAS programmer mailing list I maintain.

  • Option I: Take “CDISC Express“. How many lines of SAS source code are there in the package? If you haven’t installed “CDISC Express”, you can download it from here.
  • Option II: Take “C:\Program Files\SAS\” which is the default SAS installation folder. Count the lines of all SAS source code going with your SAS installation.

This challenge is actually quite relevant: from time to time, when you market a new SAS package, you may want to tell your prospects the total number of source lines as a good measure of the scale of your product.

Please send your code to me, be it in SAS or not :) The deadline is July 1, 2011 and the prize is a $25 gift card from either Macy’s or Fry’s.

Programming Challenge: Solve Eight Queens Puzzle by SAS

May 7, 2011 by     1 Comment     Posted under: Monthly Contest, New Technologies

Megha has kept up her momentum to become the winner of the second programming challenge: Web Crawling (L1). Congratulations, Megha :) !

About web crawling: we’re done with single-page mining and cross-page mining. The next step is to move to single-domain mining. However, to do that, we’ve got to make some preparations. That is where the “Eight queens puzzle” comes in.

As defined on Wikipedia,

The eight queens puzzle is the problem of placing eight chess queens on an 8×8 chessboard so that no two queens attack each other. Thus, a solution requires that no two queens share the same row, column, or diagonal.

So our challenge is: write a SAS program to solve the eight queens puzzle. Please follow this output format: for the example shown in the figure below, the solution is expressed as 24683175. That is, every solution is written as an eight-digit number: the 1st digit from the left represents the rank of the queen in file a, the second the rank of the queen in file b, et cetera.

  a b c d e f g h
8 . . . Q . . . . 8
7 . . . . . . Q . 7
6 . . Q . . . . . 6
5 . . . . . . . Q 5
4 . Q . . . . . . 4
3 . . . . Q . . . 3
2 Q . . . . . . . 2
1 . . . . . Q . . 1
  a b c d e f g h
(Q = white queen)
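
To make the output format concrete, here is a small backtracking sketch in Python (the challenge itself asks for SAS, so this is only an illustration, and the function name solve_queens is my own). It places one queen per file, from file a to file h, and records the rank chosen for each file as one digit.

```python
def solve_queens(n: int = 8) -> list[str]:
    """Return every solution as a string of ranks, one digit per file."""
    solutions = []
    ranks = []  # ranks[i] is the rank of the queen in file i (a = 0)

    def place(file_index: int) -> None:
        if file_index == n:
            solutions.append("".join(str(r) for r in ranks))
            return
        for rank in range(1, n + 1):
            # Safe if no earlier queen shares the rank or a diagonal.
            if all(rank != r and abs(rank - r) != file_index - f
                   for f, r in enumerate(ranks)):
                ranks.append(rank)
                place(file_index + 1)
                ranks.pop()

    place(0)
    return solutions


solutions = solve_queens()
print(len(solutions))             # 92
print("24683175" in solutions)    # True: the position in the figure
```
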

Please send your code to me. The deadline is June 1, 2011 and the prize is a $25 gift card from either Macy’s or Fry’s.
