Built-in Facts
greenlight provides a few built-in facts.
Percent_Null
Finds the percentage of null (missing) values for a given variable in the provided dataset, in both dev and prod.
pn = Percent_Null(
    variable,     # variable in question (Note: this is case-sensitive)
    path_to_lib,  # directory (ex: "/wrds/ff/sasdata")
    dsn,          # dataset (ex: "factors_monthly")
)
1.) Called directly – good for verifying behavior.
>>> pn = Percent_Null('PRICE', '/wrds/levin/sasdata', 'healthcare_ma')
>>> pn.obtain()
{'dev_healthcare_ma_PRICE_percent_null': 0.39591,
 'prod_healthcare_ma_PRICE_percent_null': 0.39591}
2.) As a custom fact.
# Add the custom fact to your `custom_facts` list.
custom_facts.append(Percent_Null('PRICE', '/wrds/levin/sasdata', 'healthcare_ma'))
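Because a direct call returns one `dev_` entry and one `prod_` entry, it is a convenient way to check that the two environments agree. The following is a minimal sketch that operates on a plain dictionary shaped like the example output above. `split_by_env` and `mismatches` are hypothetical helpers, not part of greenlight, and the key layout is inferred only from that example.

```python
def split_by_env(facts):
    """Group a fact dict by key suffix: {suffix: {'dev': v, 'prod': v}}."""
    grouped = {}
    for key, value in facts.items():
        # Assumption: every key starts with 'dev_' or 'prod_'.
        env, _, suffix = key.partition('_')
        if env in ('dev', 'prod'):
            grouped.setdefault(suffix, {})[env] = value
    return grouped


def mismatches(facts):
    """Return the suffixes whose dev and prod values differ."""
    return sorted(
        suffix
        for suffix, envs in split_by_env(facts).items()
        if envs.get('dev') != envs.get('prod')
    )


# Dictionary copied from the example output above.
facts = {
    'dev_healthcare_ma_PRICE_percent_null': 0.39591,
    'prod_healthcare_ma_PRICE_percent_null': 0.39591,
}
print(mismatches(facts))  # an empty list means dev and prod agree
```

For floating-point facts, an exact comparison may be too strict; a tolerance check such as `math.isclose` could be swapped in.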
Min_Date
Finds the minimum (earliest) value of the provided date variable in the provided dataset, in both dev and prod.
min_date = Min_Date(
    date_variable,            # date variable in question (Note: this is case-sensitive)
    path_to_lib,              # directory (ex: "/wrds/ff/sasdata")
    dsn,                      # dataset (ex: "factors_monthly")
    date_format='yymmdd10.',  # [Optional] date output format (yymmdd10. by default)
)
1.) Called directly – good for verifying behavior.
>>> min_date = Min_Date('date', '/wrds/ff/sasdata', 'factors_daily')
>>> min_date.obtain()
{'dev_factors_daily_min_date': '1926-07-01',
 'prod_factors_daily_min_date': '1926-07-01'}
2.) As a custom fact:
# Add the fact to your `custom_facts` list.
custom_facts.append(
    Min_Date('date', '/wrds/ff/sasdata', 'factors_daily')
)
Optionally, pass date_format to change the date format of the output.
>>> Min_Date('date', '/wrds/ff/sasdata', 'factors_daily', date_format='yymmdd8.')
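The `date_format` values look like SAS date formats. Assuming that is the case, `yymmdd10.` writes a four-digit year (`1926-07-01`) and `yymmdd8.` writes a two-digit year (`26-07-01`). The sketch below mimics those two layouts in plain Python purely for illustration. `render` is a hypothetical helper and does not call greenlight or SAS.

```python
from datetime import date


def render(d, date_format='yymmdd10.'):
    """Mimic the layout of two SAS-style date formats (illustration only)."""
    if date_format == 'yymmdd10.':
        # Four-digit year: yyyy-mm-dd
        return f'{d.year:04d}-{d.month:02d}-{d.day:02d}'
    if date_format == 'yymmdd8.':
        # Two-digit year: yy-mm-dd
        return f'{d.year % 100:02d}-{d.month:02d}-{d.day:02d}'
    raise ValueError(f'unsupported format in this sketch: {date_format}')


first = date(1926, 7, 1)
print(render(first))              # 1926-07-01
print(render(first, 'yymmdd8.'))  # 26-07-01
```

With a two-digit year, 1926 and 2026 render identically, which is worth keeping in mind for datasets with long histories.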
Max_Date
Finds the maximum (latest) value of the provided date variable in the provided dataset, in both dev and prod.
max_date = Max_Date(
    date_variable,            # date variable in question (Note: this is case-sensitive)
    path_to_lib,              # directory (ex: "/wrds/ff/sasdata")
    dsn,                      # dataset (ex: "factors_monthly")
    date_format='yymmdd10.',  # [Optional] date output format (yymmdd10. by default)
)
1.) Called directly – good for verifying behavior.
>>> max_date = Max_Date('date', '/wrds/ff/sasdata', 'factors_daily')
>>> max_date.obtain()
{'dev_factors_daily_max_date': '2018-06-29',
 'prod_factors_daily_max_date': '2018-08-31'}
2.) As a custom fact:
# Add the fact to your `custom_facts` list.
custom_facts.append(Max_Date('date', '/wrds/ff/sasdata', 'factors_daily'))
Optionally, pass date_format to change the date format of the output.
>>> Max_Date('date', '/wrds/ff/sasdata', 'factors_daily', date_format='yymmdd8.')
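In the direct-call example above, dev and prod report different maximum dates. A small sketch of how that gap could be measured from the returned dictionary follows. `lag_in_days` is a hypothetical helper, not part of greenlight, and it assumes the default `yymmdd10.` output, which appears as `yyyy-mm-dd` in the example.

```python
from datetime import datetime


def lag_in_days(facts, fmt='%Y-%m-%d'):
    """Days by which the prod date is ahead of dev (negative if behind)."""
    # Assumption: exactly one 'dev_' key and one 'prod_' key are present.
    dev = next(v for k, v in facts.items() if k.startswith('dev_'))
    prod = next(v for k, v in facts.items() if k.startswith('prod_'))
    dev_date = datetime.strptime(dev, fmt).date()
    prod_date = datetime.strptime(prod, fmt).date()
    return (prod_date - dev_date).days


# Dictionary copied from the example output above.
facts = {
    'dev_factors_daily_max_date': '2018-06-29',
    'prod_factors_daily_max_date': '2018-08-31',
}
print(lag_in_days(facts))  # 63
```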
Null_Variables
Discovers which variables in a given table consist entirely of missing values (in both dev and prod).
1.) Called directly – good for verifying behavior.
>>> null_vars = Null_Variables('/wrds/ff/sasdata', 'factors_monthly')
>>> null_vars.obtain()
{'dev_factors_monthly_null_variables': [],
 'prod_factors_monthly_null_variables': []}
2.) As a custom fact.
# Add the custom fact to your `custom_facts` list.
custom_facts.append(Null_Variables('/wrds/ff/sasdata', 'factors_monthly'))
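To make "composed entirely of missing values" concrete, here is a minimal sketch of the idea in plain Python. It is not greenlight's implementation: `all_null_columns` is a hypothetical helper, missing values are represented as `None`, and the rows are made-up toy data.

```python
def all_null_columns(rows):
    """Return the column names whose value is None in every row."""
    if not rows:
        return []
    # Assumption: every row has the same columns as the first row.
    columns = list(rows[0].keys())
    return [c for c in columns if all(row.get(c) is None for row in rows)]


# Toy rows: 'c' is missing everywhere, 'b' only in one row.
rows = [
    {'a': 1, 'b': 0.1, 'c': None},
    {'a': 2, 'b': None, 'c': None},
]
print(all_null_columns(rows))  # ['c']
```

A variable with even one non-missing value, such as `b` here, is not reported.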