New Course on CDISC Standards
Clinical information is increasingly complex and comes from multiple sources in different formats. As a result, clinical data submission has become time-consuming, costly and error-prone. CDISC® (Clinical Data Interchange Standards Consortium) established data standards to speed up data review and improve clinical data exchange, storage and archival. Our technology edge, combined with our experience in standards implementation, allows us to develop tailored CDISC solutions to accelerate your FDA review. Clinovo now offers a new opportunity to learn these recognized clinical data standards!
Clinovo’s new “CDISC Standards: Theory and Application” class is an 8-week training program starting on June 11, 2013. The TechTrainings are hands-on technical classes for entry-level and experienced clinical trial professionals, designed to help them reach the next step in their careers. The class will be held in Palo Alto at the Dentons offices, or remotely.
Taught by Sy Truong, President at Meta-Xceed and author of award-winning papers, this new course will give an overview of CDISC standards: ODM, SDTM, ADaM and Define.XML. Students will learn how to transform legacy data into these clinical standards through real-life examples. Case studies will include data exchange, archival, and electronic submission to regulatory agencies such as the FDA.
Clinovo will continue to offer the “Base Clinical SAS Programming” class to help entry-level programmers prepare for the Base SAS certification, as well as the “Advanced Clinical SAS Programming” class to tackle advanced real-world SAS programming challenges. Clinovo offers $50 gift cards for referrals.
More information on the class can be found on clinovo.com/techtrainings.
Olivier Roth launched the TechTrainings by Clinovo in 2012, a series of hands-on courses for clinical trial professionals that draws on his company’s years of field experience and industry expertise. He is the Marketing & Communication Coordinator at Clinovo, a CRO based in Sunnyvale that focuses on streamlining clinical trials for life science companies through technology solutions. Olivier helps manage Clinovo’s marketing and communication, from marketing strategy to partnership management, lead generation, event planning and new business opportunities. Prior to Clinovo, Olivier worked as a Strategic Marketing Consultant at VivaSante, an international consumer healthcare company based in Paris.
Are you prepared for CDISC?
CDISC® (Clinical Data Interchange Standards Consortium) is establishing data standards to speed up data review and improve clinical data exchange, storage and archival. Today, 60% of FDA submissions are already made in CDISC standards. The FDA is increasingly involved in CDISC standards, a meaningful signal for the industry. Theresa Mullin, Director of the Office of Planning and Informatics within CDER, stated that “the FDA is committed to using CDISC standards for the foreseeable future”. These data standards are expected to become mandatory by 2016 for every drug submission.
CDISC standards hold clinical data to a higher level of readability and compliance with FDA requirements. Carey Smoak, Senior Manager of SAS Programming at Roche Molecular and CDISC Device Team Leader, points out that “a submission without CDISC standards can have a review period twice as long as one under standards”. The standards facilitate the FDA review process because reviewers already know and understand them.
A 2009 study conducted by Gartner in collaboration with the CDISC organization shows that the overall clinical trial duration is halved when using CDISC standards. CDISC standards thus ultimately speed up time to market.
So if the benefits of using CDISC standards are so obvious, why are so many sponsor companies still not adopting them?
Converting legacy data to CDISC standards is expensive
Clinical data standardization is no simple process: it is time-consuming and tedious. However, a few open source CDISC conversion tools have been launched to address this problem. One successful example is the OpenCDISC validator software, recognized by the FDA and freely available. CDISC Express, Clinovo’s free SAS-based SDTM mapping tool, has been downloaded 600 times.
Standards can be adopted more smoothly if the industry incorporates them earlier in the process. Indeed, the next challenge is to push CDISC standards upfront in the clinical trial process. CDISC experts agree that the best time to implement CDISC standards is at database build.
CDISC standards are still evolving
Standards are still being built and are constantly evolving. The CDISC organization is still releasing new versions of its clinical standards. Sponsor companies are often afraid that if they convert their clinical data to one format, it will be obsolete a year later. Clinical trial experts nevertheless maintain that sponsor companies should shift to CDISC standards as soon as possible.
Companies often lack the internal expertise
Carey Smoak points out that, for the standards to be used and maintained efficiently, “the wiser choice is to hire people with expertise on CDISC standards”. Companies should educate themselves on this topic and hire experts exclusively from CDISC Registered Solutions Provider organizations.
Ale Gicqueau, CEO at Clinovo
Open Source Technologies for Clinical Trials
Clinical trials have become increasingly complex and, as a result, costly. Only 333 drugs and biologics were approved between 2000 and 2010 due to stricter regulatory procedures, while spending has increased by 15 over the same period.
The need for innovation is critical in the pharmaceutical and biotechnology industry. Life science companies and service providers are looking for innovative solutions to improve study performance and minimize their risks. This article will demonstrate how open source technology presents an innovative solution to this challenging environment, and ultimately helps bring medical innovations faster to patients.
What is open source? Open source is a type of software license. There are various types of open source licenses, but the common characteristic to all is allowing free distribution of the underlying source code. Famous open source systems include Linux, Apache, MySQL, and many others. Below is a definition of Open-Source Software:
- Free redistribution
- Source code
- Derived works
- Integrity of the author’s source code
- No discrimination against persons or groups
- No discrimination against fields of endeavor
- Distribution of license
- License must not be specific to a product
- License must not restrict other software
- License must be technology-neutral
Taken from Opensource.org. See http://opensource.org/docs/definition.php for an annotated description of the above points.
Open Source in the Clinical Trial Industry. Two pioneers in open source technology for clinical trials are Cynthia Brandt and Prakash Nadkarni of the Yale Center for Medical Informatics, with their TrialDB system (http://ycmi.med.yale.edu/trialdb/), an open-source Clinical Study Data Management System (CSDMS) for the storage and management of clinical data, initiated in the 1990s.
The US National Cancer Institute launched a wide-ranging, open-source friendly initiative named CaBIG (Cancer Biomedical Informatics Grid - https://cabig.nci.nih.gov), that aims to develop a collaborative information network to accelerate the detection, diagnosis, treatment, and prevention of cancer.
Open source software is also used for electronic data capture (OpenClinica, ClinCapture), clinical research (LabKey Server), Electronic health or medical record (OpenEMR), analysis (R project), and CDISC conversion (CDISC Express, OpenCDISC).
Benefits of open source technologies for clinical trials. While open source is prevalent in many industries, this technology is still emerging in the field of clinical trials. The development of open source technology in the clinical arena has been quickly growing. Eric Morrie, Manager for Clinical Programming in one of the worldwide leading medical device companies, shared his extensive experience on open source technologies at a Silicon Valley BioTalks (http://www.clinovo.com/biotalks/open-source/article). Eric explained how open source technologies save time, improve re-usability and simplify the customization of systems to a company’s needs.
- Provide state-of-the-art, cost-effective solutions
Proprietary systems for clinical data management are often too expensive for individual researchers and smaller companies. As a result, they often use slow, error-prone paper-based methods.
Ale Gicqueau, President and CEO of Clinovo, a CRO based in the Bay Area, explains that with open source technologies, the license fee for proprietary systems is no longer a barrier to entry for small and mid-size companies (http://www.clinovo.com/biotalks/open-source/article). Open source clinical data management systems save money by eliminating the reliance on expensive proprietary systems, while ensuring the same level of quality. They give smaller companies a means to access high-quality technologies for clinical data management and to comply with international regulatory standards.
- Avoids the risks of vendor lock-in
Proprietary systems lock a customer into a vendor’s product from which they cannot escape without substantial switching costs. Such dependence includes reliance for maintenance and support, and the necessity to accept version upgrades that the buyer may not need.
Widely adopted open source systems, on the other hand, have multiple vendors supporting them. Surveys show that early adopters of open source technologies are driven by the “reduced dependence on software vendors”, often seen as one of the most important advantages of open source technology.
- Enables a larger community to maintain and enhance the source code
The open source model enables quick improvements by giving access to the underlying source code to a large community of talented developers. In the open source community, developers are encouraged to produce derived works to enhance the existing source code.
“The Open Source community attracts very bright, very motivated developers”, explains the UK software consultancy company GB Direct (http://open-source.gbdirect.co.uk/migration/benefit.html). “Highly prized factors are clean design, reliability and maintainability, with adherence to standards and shared community values preeminent.”
A rising trend: Open source for electronic data capture: One of the best-known open source systems for clinical trials is OpenClinica, with a community of over 12,000 developers. EDC systems are often prohibitively expensive, with costs in the hundreds of thousands of dollars. As a result, open source technology has been particularly well received in the field of electronic data capture. Open source EDC platforms deliver the same benefits as proprietary EDC systems but without the license fee.
The one I am most familiar with is ClinCapture, an open source EDC system developed by Clinovo. It is a validated and enhanced version of the #1 open source EDC platform, fully customizable to any clinical study.
This open source EDC system has been successfully implemented by major pharmaceutical, medical device and biotech companies. Victor Chen, Director of Clinical Affairs at Intuitive Surgical, explains that he decided to use this technology because of its low price and a flexibility that suits adaptive clinical trials. However, he emphasizes the importance of rigorously assessing any open source system against proprietary systems and evaluating the cost of validation.
The emergence of open source based tools for CDISC: Converting clinical data to the widely recognized CDISC SDTM standard is often done manually, which can quickly become tedious, error-prone, and time-consuming. CDISC SDTM is the standard format recommended by the FDA for clinical trial data submission. The mission of CDISC is to develop and support global, platform-independent data standards to improve medical research.
Clinovo developed an open-source system to help with this conversion to CDISC SDTM: CDISC Express. CDISC Express is a powerful open source SAS®-based system that automatically converts clinical data into CDISC SDTM using an Excel framework.
The CDISC Express framework is highly extensible. The system significantly speeds up CDISC SDTM conversion, and has been successfully implemented for major biotechnology and pharmaceutical companies.
Conclusion: Today, it takes on average 10 to 15 years to develop a drug, at a cost of nearly $1.2 billion. With only 2 of 10 marketed drugs returning revenues that match or exceed R&D costs, developing medical innovations has become more and more risky. Open source technologies are an innovative way to lower the cost of clinical trials and minimize risk, while ensuring the same level of quality as proprietary systems.
Ultimately, open source technologies increase the scope and variety of clinical trials by enabling smaller institutions to pursue clinical research that would otherwise be out of reach and beyond their financial capacity. “We believe that an open-source approach has the best chance of ensuring that all kind of groups can be involved with the development of systems that have bearing on global public health”, explain Greg W. Fegan and Trudie A. Lang in their featured article Could an Open-Source Clinical Trial Data Management System Be What We Have All Been Looking For?
References
- Silicon Valley BioTalks, June 2011 : http://www.clinovo.com/node/129
- “Could an Open-Source Clinical Trial Data Management System Be What We Have All Been Looking For?”, By Greg W. Fegan and Trudie A. Lang, March 4, 2008
- “Overcoming Obstacles To Successful Clinical Trials through Open Source”, by Benjamin Baumann, Nov 10, 2011
- 2011 profile, PhRMA Pharmaceutical Industry
- Opensource.org
- https://cabig.nci.nih.gov
- http://ycmi.med.yale.edu/trialdb/
- http://open-source.gbdirect.co.uk/migration/benefit.html
- Health Decision Webinar “Top 10 Benefits of Adaptive Design”, Jan 25, 2011
Download
- CDISC Express: www.clinovo.com/cdisc
- ClinCapture brochure: http://www.clinovo.com/resources/brochures/clincapture
- Clinovo case studies on open source systems: http://www.clinovo.com/case_studies
PharmaSUG 2012: A great conference!
This was the third consecutive year Clinovo attended PharmaSUG, after Orlando in 2010 and Nashville in 2011. This time, we had the “home” advantage, which allowed us to bring many co-workers to this year’s conference. According to many of the participants, it was one of the best PharmaSUG conferences ever. Everyone loves to visit San Francisco despite Mark Twain’s famous sentence: “The coldest winter I ever spent was a summer in San Francisco.” The organization was impeccable and the food succulent. The only hiccup was that a thief managed to get onto the exhibit floor during lunch and steal equipment from exhibitors.
A New Way to Collect Data: CDASH by Anayansi Gamboa

There is a general consensus that the old paper-based data management tools and processes were inefficient and should be optimized. Electronic Data Capture (EDC) has transformed clinical trial data collection from a paper-based Case Report Form (CRF) process into an electronic CRF process.
In an attempt to optimize the process of collecting and cleaning clinical data, the Clinical Data Interchange Standards Consortium (CDISC) has developed standards that span the research spectrum from preclinical through postmarketing studies, including regulatory submission. These standards primarily focus on definitions of electronic data, the mechanisms for transmitting them, and, to a limited degree, related documents, such as the protocol.
CDISC Releases New Protocol Representation Toolkit
Austin, TX – 18 April 2012 – The Clinical Data Interchange Standards Consortium (CDISC) is pleased to announce today at the CDISC European Interchange Conference in Stockholm, Sweden, the release of the first iteration of a Protocol Representation “Toolkit” for clinical research. The purpose is to make it easy for authors of the research plan or protocol to reap the benefits of the Protocol Representation Model (PRM), which has been developed over the past decade by global clinical research experts from academia, industry and government. Using such a model can save time and resources for research studies by enabling electronic re-use of protocol information for other purposes such as clinical trial registration, study tracking, regulatory information and study reports. The current release of the “Toolkit” includes a standard Study Outline Template in MS Word format, a standard list of Study Outline Concepts, and a complete mapping of the Study Outline Concepts to both the Biomedical Research Integrated Domain Group (BRIDG) model and the CDISC Study Data Tabulation Model (SDTM) Trial Summary (TS) Domain.
How CDISC standards streamline clinical trials
On March 7th, 2012, Clinovo held the third Silicon Valley BioTalks, hosted by SNR Denton in their Palo Alto offices. The event welcomed over 60 professionals from the biotechnology and pharmaceutical industries. The panel was composed of top-tier CDISC experts:
- John Brega (PharmaStat) CDISC Implementation and eCTD Submissions
- Carey Smoak (Roche Molecular) Senior Manager, SAS Programming and CDISC Device Team Leader
- Dave Borbas (Jazz Pharmaceuticals) Senior Director, Data Management
- Ale Gicqueau (Clinovo) President & CEO

- Carey Smoak made the point that the medical device world is moving to implement CDISC standards. One has to be aware that simply putting data in an electronic database is quite new for some medical device companies. Combination products, in which a medical device interacts with a drug, favor the implementation of CDISC standards. Indeed, the medical device world is learning from the drug industry’s experience with CDISC standards.
- John Brega mentioned that 60% of FDA submissions are already made in CDISC standards. He also stated that smaller pharma companies adopt CDISC standards faster. Many of the bigger companies have already developed in-house standards, and even though they see the benefits of CDISC standards, it takes money and time to transform their processes and change their habits. Smaller players that have few or no standards in place, on the other hand, are more inclined to start using FDA approved CDISC standards.
- One advantage of CDISC standards is that they hold clinical data to a higher level of readability and compliance with FDA requirements. A submission without SDTM can have a review period twice as long as one under SDTM standards. CDISC conversion allows submitters to find problems or discrepancies before the FDA does, which brings more data consistency and confidence to the FDA submission. It saves time and frustration on both sides.
- Carey Smoak said that the earlier CDISC standards are used, the better. The best time to implement them is at database build. Pushing CDISC standards upfront in the clinical trial process is a real challenge. He recommended hiring people with expertise in CDISC standards. Companies should educate themselves on this hot topic and hire only real experts and approved service providers.
- Dave Borbas mentioned existing CDISC conversion tools, such as the OpenCDISC validator software, recognized by the FDA and freely available. He talked about CDISC Express, Clinovo’s free SAS-based SDTM mapping tool. Carey Smoak mentioned the development of new software and the proliferation of smartphone applications, but stated that quality, validation and compliance are very tricky in this case and an upcoming challenge.
- Dave Borbas said that the FDA is increasingly involved in CDISC standards. The FDA now calls them “their” standards, which is new and a great signal to the CDISC community.
Do you want to learn more on CDISC conversion?
Register for free for our next Webinar on March 28th at 9am:
CDISC® SDTM Conversion Made Easy with CDISC Express
Dive into CDISC Express (5) : Generate and validate SDTM domains and define.xml
Here is the fifth part of Dive into CDISC Express.
Part one | Part two | Part three | Part four
The remaining tasks, such as generating SDTM domains and define.xml, take only a few button clicks in CDISC Express once you have a well-designed mapping file. The software does most of the work, so few words are needed.
Step 3 of 6: Validate mapping file (Validate_Mapping_File.sas)
Designing the mapping file is an iterative process: design, validate, modify and re-validate. In the end, the file will at least be free of syntax errors (avoiding semantic errors depends on your domain knowledge). The validated mapping file, named mapping.xls, is copied to …\ doc\Mapping file – validated version\ from the working file, tmpmaping.xls. You will see:
The corresponding log file in folder …\ log\
A report in …\results\Mapping Validation\, named Mapping_validation.html
The temporary datasets in …\tempdata\ and …\temp\
Step 4 of 6: Generate SDTM datasets (generate_SDTM.sas)
If the mapping file is valid, generating SDTM domains takes a single click. After submitting the code, you will see the log file, reports, SDTM datasets and temporary datasets in the corresponding folders.
Step 5 of 6: Validate SDTM datasets (Validate_SDTM_Domains.sas)
The output files of the SDTM dataset validation are all located in C:\Program Files\CDISC Express\SDTM Validation\.
Step 6 of 6: Generate Define.xml and xpt (generate_Definexml.sas)
This step produces the final define.xml file and the SAS transport files (.xpt).
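For readers who have not worked with transport files before, the sketch below shows how a single domain can be written to a .xpt file in plain SAS with the XPORT engine. It is only an illustration of the file format step, not the code that generate_Definexml.sas runs; the folder paths and librefs are placeholders.

```sas
/* Placeholder paths and librefs: these are assumptions, not CDISC Express settings. */
libname sdtm "C:\mystudy\sdtm";                 /* folder holding the SDTM datasets */
libname xptout xport "C:\mystudy\xpt\dm.xpt";   /* one transport file per domain    */

/* Copy the DM dataset into the transport file */
proc copy in=sdtm out=xptout memtype=data;
    select dm;
run;

libname xptout clear;
```

The XPORT engine writes the version 5 transport format, so dataset and variable names are limited to 8 characters, which matches SDTM naming conventions.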
Recommended reading and action taken
For a quick start and a deeper understanding, you can read the official documentation in the following sequence:
C:\Program Files\CDISC Express\documentation\FAQ.htm
C:\Program Files\CDISC Express\documentation\Quick Start.htm
C:\Program Files\CDISC Express\documentation\User guide.htm
A video tutorial is also helpful:
C:\Program Files\CDISC Express\documentation\videotutorial.htm
A must-read conference paper, An Excel Framework to Convert Clinical Data to CDISC SDTM Leveraging SAS Technology by Sophie McCallum and Stephen Chan of Clinovo, offers an excellent discussion of the architecture of CDISC Express:
http://www.lexjansen.com/pharmasug/2011/ad/pharmasug-2011-ad08.pdf
Dive into CDISC Express (4) : Data manipulation techniques
Here is the fourth part of Dive into CDISC Express.
Part one | Part two | Part three
3. Data manipulation techniques in CDISC Express
CDISC Express supplies a relatively rich set of data manipulation techniques that work together with the SAS language for data mapping. The following list is not exhaustive, and I will keep it updated.
3.1 Reference one dataset
A raw dataset name in the “Dataset” column indicates a “set” operation in SAS.
All dataset options can be used when referencing a dataset, such as
siteinv(drop=invcode)
siteinv(rename=(invcode=inv))
siteinv(where=(invcode ne ""))
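In plain SAS terms, each of these mapping-file entries behaves roughly like a SET statement carrying the same dataset options. The sketch below illustrates that reading; it is not the code CDISC Express generates, and the output dataset names are hypothetical.

```sas
/* Output dataset names are hypothetical; siteinv and invcode come from the entries above. */
data no_invcode;
    set siteinv(drop=invcode);             /* the invcode variable is not read in  */
run;

data renamed;
    set siteinv(rename=(invcode=inv));     /* invcode is available under the name inv */
run;

data nonmissing;
    set siteinv(where=(invcode ne ""));    /* only rows with a non-blank invcode   */
run;
```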
You can also reference an external dataset. Incorporate the external file into the spreadsheet under a name beginning with an underscore, “_”; here it is “_visits”.
Then you can use it in any domain that needs it, e.g., the TV domain.
The macro %cpd_importlist is used to import the external dataset, “_visits”. Again, this macro resides in C:\Program Files\CDISC Express\macros\function_library\.
Using a macro call to reshape or modify an input dataset offers great flexibility in referencing data. We will discuss the benefits later on.
3.2 Assignment
You can assign a number, a string, or a dataset variable, combined with any valid SAS functions, to an SDTM domain variable in the “Expression” column.
Sometimes a temporary variable is needed for a later calculation. You can create one in the “Dataset” column with an assignment in the “Expression” column, just as for any other domain variable. There are two differences: first, the names of temporary variables begin with an asterisk, “*”; second, temporary variables are not included in the final domain. Once created, a temporary variable can be used in any other expression.
Three special symbols are used in the “Dataset” column of CDISC Express. The asterisk, “*”, indicates a temporary variable; the other two are
Tilde, “~”: indicates a variable used for the supplemental domain (SUPPQUAL).
Number sign, “#”: indicates a variable used for the comments domain (CO).
Another symbol, the at sign, “@”, is used in the “Expression” column to reference a variable produced earlier.
In this case, “AGEU” uses “AGE” as input, and “AGE” is calculated beforehand. “@AGE” simply indicates the dependency. Conceptually, it resembles the “calculated” keyword in SAS PROC SQL:
proc sql;
    select (AvgHigh - 32) * 5/9 as HighC,
           (AvgLow - 32) * 5/9 as LowC,
           (calculated HighC - calculated LowC) as Range
    from temps;
quit;
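The same ideas can be pictured in a data step. The sketch below is an analogy only: the input variables birthdt and visitdt are assumed names, the age formula is a simple approximation, and the code is not what CDISC Express produces from a mapping file.

```sas
/* Illustrative only: demog, birthdt and visitdt are assumed names. */
data dm;
    set demog;

    /* Plays the role of a temporary "*" variable: used below, dropped at the end */
    _days = visitdt - birthdt;

    AGE = floor(_days / 365.25);

    /* Plays the role of an "@AGE" reference: AGEU depends on the AGE derived above */
    if not missing(AGE) then AGEU = "YEARS";

    drop _days;
run;
```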
3.3 Match-merging
We already saw a match-merging example. If “all” appears in the “Dataset” column, all the previous datasets are first merged by the common key specified in the “Merge Key” column, ready for later processing. If no key is assigned, the system uses the patient ID.
CDISC Express also supports two types of join, inner join and outer join (left, right, full), using data steps. The implementation differs slightly from standard SQL, but the ideas are the same.
We add a new column, “Join”, usually beside the “Merge Key” column.
The “Join” column takes two values: “I” stands for “inner join” and “O” for “outer join”. A join indicator of “I” acts like the “in=” dataset option combined with a subsetting “if” on that dataset, while “O” applies no such restriction. Using the above as an illustration, the underlying SAS code looks like
data temp;
merge demog(in=a) siteinv(in=b);
by sitecode;
if b;
run;
This is the so-called “right outer join”. Combining “I” and “O” across the two datasets yields all four types of join: one inner join and three outer joins.
If no “Join” column is specified, CDISC Express performs an inner join by default.
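Following the description above, the four combinations can be sketched in plain data-step code as follows. The pairing of each “I”/“O” combination with a join type is my reading of the example, and the output dataset names are hypothetical.

```sas
/* Both inputs must be sorted by the merge key before a match-merge. */
proc sort data=demog;   by sitecode; run;
proc sort data=siteinv; by sitecode; run;

data inner_join;                     /* demog = I, siteinv = I */
    merge demog(in=a) siteinv(in=b);
    by sitecode;
    if a and b;                      /* key must exist in both datasets */
run;

data left_join;                      /* demog = I, siteinv = O */
    merge demog(in=a) siteinv;
    by sitecode;
    if a;                            /* keep every demog key */
run;

data right_join;                     /* demog = O, siteinv = I */
    merge demog siteinv(in=b);
    by sitecode;
    if b;                            /* keep every siteinv key */
run;

data full_join;                      /* demog = O, siteinv = O */
    merge demog siteinv;
    by sitecode;                     /* no subsetting: keep all keys */
run;
```

Keep in mind that a data-step merge pairs rows sequentially within each BY group, so it matches an SQL join only for one-to-one and one-to-many relationships.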
So far, CDISC Express does not support multiple merge keys. For example, the following file is currently illegal:
| Dataset | Merge Key |
| arm | siteid, grpno |
| armdescri | siteid, grpno |
The developer, Romain, indicated that this enhancement would be considered for the next round of the product roadmap, and he also proposed a workaround. To merge on multiple keys, create a temporary variable that holds the concatenation of the keys, then use this temporary variable as a single merge key.
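In plain SAS, the workaround amounts to something like the sketch below. Inside CDISC Express the composite key would be a temporary “*” variable in the mapping file; here the variable name mkey, its length, the delimiter and the output dataset names are all assumptions made for illustration.

```sas
/* Build the same composite key in both datasets. */
data arm_k;
    set arm;
    length mkey $40;                    /* assumed width, adjust to the data */
    mkey = catx("|", siteid, grpno);    /* delimiter keeps 1+23 distinct from 12+3 */
run;

data armdescri_k;
    set armdescri;
    length mkey $40;
    mkey = catx("|", siteid, grpno);
run;

proc sort data=arm_k;       by mkey; run;
proc sort data=armdescri_k; by mkey; run;

/* Merge on the single composite key, then discard it */
data arm_all;
    merge arm_k armdescri_k;
    by mkey;
    drop mkey;
run;
```

A delimiter is worth including in the concatenation: without it, different key pairs can collapse into the same string.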
3.4 Concatenating
We have discussed the “merge” operation in CDISC Express at length. This section is dedicated to the “set” operation. We already know how to “set” one dataset for referencing, but how do we “set” multiple datasets, i.e., concatenate them?
By symmetry: just as “all” in the “Dataset” column indicates a merge, “all (stack)” indicates concatenation.
The above file can also be translated into SAS code for better understanding:
data height;
set vtsigns(where=(height ne .));
VSTESTCD="HEIGHT";
VSTEST ="Height";
VSORRES =put(height,best12.);
VSORRESU="cm";
VSSTRESC=put(height,best12.);
VSSTRESN=height;
VSSTRESU="cm";
run;
data weight;
set vtsigns(where=(weight ne .));
VSTESTCD="WEIGHT";
VSTEST ="Weight";
VSORRES =put(weight,best12.);
VSORRESU="kg";
VSSTRESC=put(weight,best12.);
VSSTRESN=weight;
VSSTRESU="kg";
run;
data vs;
set height weight;
STUDYID =study;
DOMAIN =&domain;
USUBJID =%CONCATENATE(_variables=study sitecode patid);
VSSEQ =%SEQUENCE();
. . .
run;
3.5 Transpose
Clinical SAS programmers do a lot of transposing to reshape raw data to fit the CDISC standards. Currently there is no explicit guidance in CDISC Express on how to transpose, but this is not the end of the story.
There are two types of transpose:
Type I: from a wide dataset (more variables, fewer observations) to a long dataset (fewer variables, more observations), e.g. transposing a one-row-per-subject dataset to a multiple-row-per-subject dataset
Type II: from a long dataset (fewer variables, more observations) to a wide dataset (more variables, fewer observations), e.g. transposing a multiple-row-per-subject dataset to a one-row-per-subject dataset
As good practice, in SAS we always use data steps with the “output” statement to perform a type I transpose and PROC TRANSPOSE for type II. Although CDISC Express doesn’t support transposing explicitly, you can at least perform a type I transpose, and, surprisingly, we have already seen it!
Go back to the section on concatenating. The example is taken from C:\Program Files\CDISC Express\studies\example2\.
The input data vtsigns is a typical wide table (more variables, fewer observations).
And the final domain VS is a typical long table (fewer variables, more observations).
So the concatenation actually performed a type I transpose, from a wide table to a long table! More often, the compact SAS code for a type I transpose looks like:
data vs;
set vtsigns;
if height ne . then do;
VSTESTCD="HEIGHT";
VSTEST ="Height";
VSORRES =put(height,best12.);
VSORRESU="cm";
VSSTRESC=put(height,best12.);
VSSTRESN=height;
VSSTRESU="cm";
output;
end;
if weight ne . then do;
VSTESTCD="WEIGHT";
VSTEST ="Weight";
VSORRES =put(weight,best12.);
VSORRESU="kg";
VSSTRESC=put(weight,best12.);
VSSTRESN=weight;
VSSTRESU="kg";
output;
end;
. . .
run;
3.6 All others: use macro!
We have now discussed almost all the common data derivation techniques in a programmer’s daily life and their implementation in CDISC Express. At least one question remains unsolved: how to perform a type II transpose, i.e. from a long table to a wide table?
It remains an open question for the developers of the application. But we can also solve this problem within the current framework: use a customized macro. You can use macros in the “Expression” and “Dataset” columns. A macro used in the “Dataset” column returns a dataset, while a macro in the “Expression” column returns a string: that is the basic structure to keep in mind when you write your own macros. For more, refer to the macros in C:\Program Files\CDISC Express\macros\function_library\. For example, %concatenate is used in the “Expression” column and %cpd_importlist in the “Dataset” column.
So it is convenient to create temporary datasets in the “Dataset” column using macros that embed a type II transpose. Anything SAS can do, you can also implement in CDISC Express: just use macros in the “Expression” and “Dataset” columns accordingly.
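As a rough idea of what such a macro could contain, here is a minimal sketch of a type II transpose wrapped in a macro. The macro name, its parameters and the dataset names are hypothetical, and the way a macro hands its dataset back to the “Dataset” column is not shown; for that, follow the conventions of the shipped macros in function_library.

```sas
/* Hypothetical macro: name, parameters and dataset names are assumptions. */
%macro long_to_wide(in=, out=, by=, id=, var=);
    /* PROC TRANSPOSE with a BY statement needs sorted input */
    proc sort data=&in out=_sorted;
        by &by;
    run;

    proc transpose data=_sorted out=&out(drop=_name_);
        by &by;      /* one output row per BY group                  */
        id &id;      /* values of this variable become column names  */
        var &var;    /* values that fill the new columns             */
    run;
%mend long_to_wide;

/* Example call: one row per subject, with HEIGHT and WEIGHT as columns */
%long_to_wide(in=vs, out=vs_wide, by=usubjid, id=vstestcd, var=vsstresn);
```

PROC TRANSPOSE stops with an error if the ID value repeats within a BY group, so the BY variables may need to include the visit when a subject has several measurements per test.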
Raw data varies with the trial design and with the clinical data capture system and procedures. It is impractical to expect a CDISC SDTM converter such as CDISC Express to map all the data at the click of a button. The introduction of CDISC Express doesn’t make programmers unnecessary. It takes most of the trivial work out of their daily life and lets them concentrate on creative work and be more productive and efficient.
The next part will close this series.
Dive into CDISC Express (2) : Create a new study
Here is the second part of ‘Dive into CDISC Express’, written by Jiangtang Hu. You can read the first part here.
Step 1 of 6: Create a new study (create_new_study.sas)
Open create_new_study.sas in C:\Program Files\CDISC Express\programs\ and you will see a single macro call:
%addnewstudy(studyname=my new study);
Just assign a study name to the macro variable, &studyname, e.g., “CLINCAP”:
%addnewstudy(studyname= CLINCAP);
Submit the code and you will find a folder named “CLINCAP” in C:\Program Files\CDISC Express\studies\, with the same structure as the two demo studies shipped with the application (example1 and example2). In the screenshot, the left and right panels show the folders and files before and after running create_new_study.sas; the same convention applies to the screenshots that follow.
![]()
Folder ‘doc’ holds the mapping files;
Folder ‘log’ holds the log files generated by subsequent macro calls, such as generating SDTM domains;
Folder ‘results’ and its subfolders hold all the outputs, such as define.xml, SAS transport files, validation reports and SDTM datasets;
Folder ‘source’ holds all the clinical raw data used as inputs for SDTM domains;
Folder ‘tempdata’ holds all the temporary datasets generated by subsequent macro calls.
Also, a configuration file named CLINCAP_configuration.sas is placed in C:\Program Files\CDISC Express\programs\study configuration\. This file is used to set some study-level parameters, such as lab and toxicity specifications (details in C:\Program Files\CDISC Express\specs\Lab specs\).
Two versions of SDTM implementation guides are supported by CDISC Express, CDISC SDTM Implementation Guide Version 3.1.1 and Version 3.1.2. You can find the corresponding specification files in C:\Program Files\CDISC Express\specs\SDTM specs\:
SDTM_Specs_3_1_1.xls
SDTM_Specs_3_1_2.xls
The choice of SDTM implementation version is also coded in the configuration file, on line 41:
%LET SDTMSPECFILE=SDTM_Specs_3_1_1.xls;
Version 3.1.1 is used by default. You can also choose Version 3.1.2 if needed:
%LET SDTMSPECFILE=SDTM_Specs_3_1_2.xls;
Assign a study name and choose an SDTM implementation version: that is all step 1 requires. Let’s take a few minutes to navigate the software. CDISC Express is a set of macros and Excel files, so it is important to know the file structure.
C:\Program Files\CDISC Express\
├─documentation : FAQ, Quick Start, User Guide
├─macros
│ ├─ClinMap : system level macros
│ └─function_library : study level macros
├─programs : “action taken” macros
│ ├─study configuration : study parameters configuration, e.g, choose SDTM version
├─SDTM Validation : For validation of SDTM domains
│ └─study1
├─specs : specification files
│ ├─Excel engine : ExcelXP tagset file
│ ├─Lab specs : lab and toxicity
│ ├─Mapping validation : validation rules
│ ├─SDTM specs : hold two versions of SDTM implementation
│ └─SDTM Terminology : SDTM codelist(including NCI terminology)
├─studies
│ ├─example1
└─temp : hold temporary data not specified to any studies
As we have seen, all the “action taken” programs such as create_new_study.sas are located in C:\Program Files\CDISC Express\programs\. create_new_study.sas calls one macro, %addnewstudy, which is in C:\Program Files\CDISC Express\macros\ClinMap\.
Note that in C:\Program Files\CDISC Express\macros\, there are two sets of macros in different folders:
C:\Program Files\CDISC Express\macros\ClinMap\: this folder holds all “system” level macros used by the application only. Modifying them is not encouraged.
C:\Program Files\CDISC Express\macros\function_library\: macros used for mapping across studies. You can also create your own macros in this folder. The macros shipped with the application are documented in the user guide.
Next part on the mapping file next week!