Crafting Custom Facts

greenlight allows you to provide your own facts about your datasets.

Using the Fact class

A Fact class is provided for you to obtain simple facts via a SAS query.

Let’s say we want a custom record count for one of our tables.

Example: Custom Count using Fact

from greenlight import Fact

# Step 1.) Declare the fact.
#
# Here we're asking how many records in ff.factors_daily have smb set to null.
custom_count = Fact(
    'smb_null_count',
    'count(*)',
    path_to_lib='/wrds/ff/sasdata',
    dsn='factors_daily',
    where='smb = .'
)

# You can also use keyword arguments to clarify intent.
custom_count = Fact(
    factname='smb_null_count',
    fact='count(*)',
    path_to_lib='/wrds/ff/sasdata',
    dsn='factors_daily',
    where='smb = .'
)

# Step 2.) Obtain the fact
custom_count.obtain()
{
    'dev_smb_null_count': 0,
    'prod_smb_null_count': 0,
}

Fact returns a dictionary (keyed by the provided factname) and the fact test result as applied to both dev and prod.

Using the SAS_Fact class to run custom SAS

A SAS_Fact class is provided to allow you to collect results from you custom SAS scripts.

Let’s you’ve written a script that does some analysis like so:

proc sql noprint;
    create table analysis as
    select distinct a.value, a.measure
    from dictionary.tables a
    join sometable b
    on a.value = b.measure
    where <some condition holds>;

    select count(*) from analysis into :output_value;

How can you use this with SAS_Fact?

Step 1: Include the stdout macro.

%include "/wrds/lib/utility/stdout.sas";

Step 2: Output the value in question using %stdout().

Our altered SAS script now looks like:

%include "/wrds/lib/utility/stdout.sas";

proc sql noprint;
    create table analysis as
    select distinct a.value, a.measure
    from dictionary.tables a
    join sometable b
    on a.value = b.measure
    where <some condition holds>;

    select count(*) from analysis into :output_value;

%stdout(&output_value);

Now you can use SAS_Fact to obtain this value during fact gathering.

First, test interactively.

from greenlight import SAS_Fact

my_custom_fact = SAS_Fact(
    'my_custom_fact_key',
    '/wrds/some_product/utilities/custom_fact.sas',
)

my_custom_fact.obtain()
{'my_custom_fact_key': <result from SAS>}


# You can also specify an output type for the SAS-derived value.
#
# Use Python's built-in types: int, float, str...
my_custom_fact = SAS_Fact(
    'my_custom_fact_key',
    '/wrds/some_product/utilities/custom_fact.sas',
    output_type=int
)

Once you are satisifed, add it to your custom_facts list.

custom_facts.append(
    SAS_Fact(
        'my_custom_fact_key',
        '/wrds/some_product/utilities/custom_fact.sas',
    )
)

Custom facts show up in the collected_facts dictionary and are keyed by custom_facts.

The custom_facts item is a dictionary. Custom facts will be keyed by the provided factname. This is true of instances of Fact as well as SAS_Fact.

Passing data into SAS_Fact at Runtime (via initstmt)

You can pass data into your custom SAS script at runtime. You can pass along macro variables as key/value pairs when declaring SAS_Fact.

from greenlight import SAS_Fact

#
# This gets passed to SAS as "%let var1=123456;"
#
>>> f = SAS_Fact('demo_fact', 'initstmt_example.sas', var1=123456)

>>> f.obtain()
{'demo_fact': 123456}