Data - building features (pynance.data.feat)

These functions are intended to be combined with functools.partial and other function decorators, with the resulting callable passed to data.labeledfeatures(). For example,

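>>> from functools import partial
>>> import pynance as pn
>>> # fn1, fn2, fn3, averaging_window, n_feature_sessions, eqdata and labelfunc are defined elsewhere
>>> featfunc = pn.decorate(partial(pn.data.feat.fromfuncs, [fn1, fn2, fn3], skipatstart=averaging_window),
...         averaging_window + n_feature_sessions - 1)
>>> features, labels = pn.data.labeledfeatures(eqdata, featfunc, labelfunc)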
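
To make the role of functools.partial concrete, here is a minimal sketch that does not depend on pynance. The names build_features and run_pipeline are hypothetical stand-ins, not library functions: the first mimics a feature builder that takes several arguments, the second mimics a consumer that is assumed to call the feature function with the data as its only argument.

```python
from functools import partial

def build_features(funcs, n_sessions, data, skipatstart=0):
    """Hypothetical stand-in for a feature builder (not pynance code)."""
    # Drop the rows consumed by the ramp-up period and the session window.
    start = skipatstart + n_sessions - 1
    return [[f(x) for f in funcs] for x in data[start:]]

def run_pipeline(data, featfunc):
    """Hypothetical consumer that only knows how to call featfunc(data)."""
    return featfunc(data)

# Bind everything except the data, so the result takes a single argument.
featfunc = partial(build_features, [abs, float], 2, skipatstart=1)
features = run_pipeline([-3, -2, -1, 0, 1], featfunc)
print(features)
```

The point of the pattern is that the consumer never needs to know which functions, window sizes, or keyword options were chosen; they are fixed once when the partial is created.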

pynance.data.feat.add_const(features)

Prepend a constant feature with value 1 as the first feature and return the modified feature set.

Parameters:

features : ndarray or DataFrame
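
As an illustration of what prepending a constant feature means, the following sketch handles the ndarray case only. The function add_const_sketch is a hypothetical name introduced here; it is not the library's implementation, and it assumes a 2-D array with one row per sample.

```python
import numpy as np

def add_const_sketch(features):
    """Hypothetical illustration: prepend a column of ones to a 2-D array."""
    features = np.asarray(features, dtype=float)
    ones = np.ones((features.shape[0], 1))
    # The constant column goes first, followed by the original features.
    return np.hstack((ones, features))

x = np.array([[2.0, 3.0], [4.0, 5.0]])
with_const = add_const_sketch(x)
print(with_const)
```

A leading column of ones is the usual way to let a linear model learn an intercept term without special handling.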

pynance.data.feat.fromcols(selection, n_sessions, eqdata, **kwargs)

Generate features from selected columns of a dataframe.

Parameters:

selection : list or tuple {str}

Columns to be used as features.

n_sessions : int

Number of sessions over which to create features.

eqdata : DataFrame

Data from which to generate feature set. Must contain as columns the values from which the features are to be generated.

constfeat : bool, optional

Whether the returned features include the constant feature. Passed as a keyword argument.

Returns:

features : DataFrame
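
One plausible reading of "features over n_sessions" is that each output row holds the values of the selected columns for the current session and the n_sessions - 1 sessions before it. The sketch below illustrates that reading with pandas. The function windowed_columns and the column naming scheme are assumptions made for this example and may differ from what fromcols actually produces.

```python
import pandas as pd

def windowed_columns(selection, n_sessions, eqdata):
    """Hypothetical sketch: one row per session, holding the current and
    the previous n_sessions - 1 values of each selected column."""
    pieces = {}
    for col in selection:
        # Oldest session first: lag n_sessions - 1 down to lag 0.
        for lag in range(n_sessions - 1, -1, -1):
            pieces[f"{-lag} {col}"] = eqdata[col].shift(lag)
    # The first n_sessions - 1 rows lack a full window, so drop them.
    return pd.DataFrame(pieces).iloc[n_sessions - 1:]

eqdata = pd.DataFrame(
    {"Close": [10.0, 11.0, 12.0, 13.0], "Volume": [100.0, 110.0, 120.0, 130.0]},
    index=pd.date_range("2015-01-05", periods=4),
)
features = windowed_columns(["Close"], 2, eqdata)
print(features)
```

Under this reading the output has n_sessions - 1 fewer rows than the input, because the earliest sessions have no complete history.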

pynance.data.feat.fromfuncs(funcs, n_sessions, eqdata, **kwargs)

Generate features by applying a list of functions to the input data.

Parameters:

funcs : list {function}

Functions to apply to eqdata. Each function is expected to return a DataFrame whose index is identical to a slice of eqdata.index. That slice must include at least eqdata.index[skipatstart + n_sessions - 1:]. Each function is also expected to carry a function attribute named title, which is used to generate the column names of the output features.

n_sessions : int

Number of sessions over which to create features.

eqdata : DataFrame

Data from which to generate features. The data will often be retrieved using pn.get().

constfeat : bool, optional

Whether the returned features include the constant feature. Passed as a keyword argument.

skipatstart : int, optional

Number of rows to omit at the start of the output DataFrame. This parameter is needed when any of the functions requires a ramp-up period before it returns valid results, e.g. sma() or functions that calculate volume relative to a past baseline. Defaults to 0.

Returns:

features : DataFrame
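
The two requirements on funcs described above (a title attribute and a possible ramp-up period) can be illustrated with a small pandas sketch. The function sma3 is a hypothetical feature function written for this example, not part of pynance. The sketch only counts the invalid leading rows for this particular rolling mean; the right value of skipatstart for real use depends on the functions involved.

```python
import pandas as pd

def sma3(eqdata):
    """Hypothetical feature function: 3-session simple moving average of Close."""
    return eqdata[["Close"]].rolling(window=3).mean()

# Attach the `title` attribute that each feature function is expected to have.
sma3.title = "SMA3"

eqdata = pd.DataFrame(
    {"Close": [1.0, 2.0, 3.0, 4.0, 5.0]},
    index=pd.date_range("2015-01-05", periods=5),
)
raw = sma3(eqdata)
# The leading NaN rows are the ramp-up period of the moving average.
rampup = int(raw["Close"].isna().sum())
# Skipping those rows leaves only valid values.
valid = raw.iloc[rampup:]
print(sma3.title, rampup)
print(valid)
```

Because Python functions are objects, setting sma3.title after the definition is enough; no wrapper class is required.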
