Data adapters

The adapter class converts the source data into a Pandas data frame in a format compatible with MTLDP functions.

This is an example adapter for the Wejo data source:

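import numpy as np
import pandas as pd

# TrajectoryAdapterBase is provided by MTLDP; import it from the module that defines it.


class WejoTrajectoryAdapter(TrajectoryAdapterBase):
    def __init__(self):
        # Column dtypes passed to pd.read_csv. Keys must match the CSV header
        # names exactly, including any leading spaces present in the source file.
        self.dtype_dict = {"DtEpochTS": np.float64,
                           " latitude": np.float64,
                           " longitude": np.float64,
                           " speed": np.float64,
                           " heading": np.float64}
        # Source column name -> MTLDP standard attribute name.
        self.attribute_map = {"dataPointId": "point_id",
                              "journeyId": "trip_id",
                              "capturedTimestamp": "datetime",
                              "DtEpochTS": "timestamp",
                              " latitude": "latitude",
                              " longitude": "longitude",
                              " speed": "speed",
                              " heading": "heading",
                              " ignitionStatus": "trip_status"}

    def load(self, file_list: list):
        # Read each CSV lazily, then combine all files into one data frame.
        df_from_each_file = (pd.read_csv(f, dtype=self.dtype_dict,
                                         parse_dates=["capturedTimestamp"]) for f in file_list)
        df = pd.concat(df_from_each_file, ignore_index=True)
        # Rename to the standard names, then order points by trip and time.
        df = df.rename(columns=self.attribute_map)
        df = df.sort_values(by=['trip_id', 'timestamp'])
        df = df.reset_index(drop=True)
        return df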

Depending on the data source, the user may wish to load points from a single trajectory (mtldp.mtltrajs.Trajectory) or points from a collection of trajectories (mtldp.mtltrajs.TrajectoryTable). The proper argument configurations are specified in the documentation.

Standard Data Format and Naming Convention (Compatibility)

The adapter should convert the source data into a compatible format. The MTLDP has standard attribute naming conventions and data formats. The Adapter class __init__ function should specify all the mapping dictionaries needed for conversion into these standard formats. For example, the data source above codes the timestamp of a trajectory point as "DtEpochTS". In the MTLDP, the timestamp should be labelled "timestamp", so the mapping dictionary reflects this change.
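As a minimal sketch of how such a mapping dictionary takes effect, the snippet below renames source-style columns with pandas `DataFrame.rename`. The sample rows are made up for illustration, and the map is only a subset of the one in the example adapter.

```python
import pandas as pd

# Hypothetical two-row sample that uses source-style column names.
raw = pd.DataFrame({
    "journeyId": ["a", "a"],
    "DtEpochTS": [1.6e9, 1.6e9 + 1],
    " latitude": [42.28, 42.29],
})

# Subset of the attribute map from the example adapter.
attribute_map = {
    "journeyId": "trip_id",
    "DtEpochTS": "timestamp",
    " latitude": "latitude",
}

# Columns listed in the map are renamed; any others would be left unchanged.
renamed = raw.rename(columns=attribute_map)
standard_columns = list(renamed.columns)
```

Because `rename` ignores keys that are absent from the data frame, a single map can list every attribute the source might provide.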

A list of trajectory attributes should be specified in the attributes.py file so that the user can perform trajectory-level analysis on information that is constant across the entire trajectory. The data source for the adapter shown above does not include any attributes that are constant across a trajectory, so its entry in attributes.py is an empty list:

WEJO_TRIP_ATTRIBUTES = []

However, other data sources may include trajectory attributes such as vehicle length and width, for example:

AACVTE_TRIP_ATTRIBUTES = ['veh_width', 'veh_length']
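To illustrate what a trajectory attribute list makes possible, here is a rough sketch that collapses point-level rows into one row per trip. The helper `extract_trip_attributes` and the sample data are hypothetical and not part of MTLDP; the library may handle these lists differently.

```python
import pandas as pd

# Illustrative attribute list, mirroring AACVTE_TRIP_ATTRIBUTES.
TRIP_ATTRIBUTES = ["veh_width", "veh_length"]


def extract_trip_attributes(df, trip_attributes):
    """Hypothetical helper: return one row per trip_id with its trajectory-level values."""
    if not trip_attributes:
        # No trajectory-level attributes (as with an empty list): index only.
        return pd.DataFrame(index=pd.Index(df["trip_id"].unique(), name="trip_id"))
    # The values are constant within a trip, so the first row is representative.
    return df.groupby("trip_id")[trip_attributes].first()


# Made-up point-level data in the standard naming convention.
points = pd.DataFrame({
    "trip_id": ["t1", "t1", "t2"],
    "timestamp": [0.0, 1.0, 0.0],
    "veh_width": [1.8, 1.8, 2.0],
    "veh_length": [4.5, 4.5, 5.1],
})

trip_table = extract_trip_attributes(points, TRIP_ATTRIBUTES)
```

With an empty attribute list the helper still returns one row per trip, just without any attribute columns.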

Data Loading

The Adapter class should also include a load(data_path) function that returns the processed trajectory information as a Pandas data frame. In the example above, the load function takes a list of CSV input files, reads and concatenates them, and builds a Pandas data frame in the required format using the naming dictionaries initialized in __init__. It then sorts the rows by trip_id and timestamp.
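The loading steps can be tried without MTLDP or real data files. The sketch below is a standalone stand-in for the adapter's load method, assuming two small in-memory CSV "files" with made-up rows and a reduced set of columns.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical CSV contents; real files use the source's own header names.
csv_a = (
    "journeyId,capturedTimestamp,DtEpochTS\n"
    "b,2021-01-01 00:00:02,1609459202\n"
    "a,2021-01-01 00:00:01,1609459201\n"
)
csv_b = (
    "journeyId,capturedTimestamp,DtEpochTS\n"
    "a,2021-01-01 00:00:00,1609459200\n"
)

dtype_dict = {"DtEpochTS": np.float64}
attribute_map = {
    "journeyId": "trip_id",
    "capturedTimestamp": "datetime",
    "DtEpochTS": "timestamp",
}


def load(file_list):
    """Standalone stand-in for the adapter's load method."""
    frames = (
        pd.read_csv(f, dtype=dtype_dict, parse_dates=["capturedTimestamp"])
        for f in file_list
    )
    df = pd.concat(frames, ignore_index=True)
    df = df.rename(columns=attribute_map)
    # Order points by trip, then by time within each trip.
    return df.sort_values(by=["trip_id", "timestamp"]).reset_index(drop=True)


# pd.read_csv accepts file-like objects as well as paths.
df = load([io.StringIO(csv_a), io.StringIO(csv_b)])
```

Rows from both inputs end up interleaved by trip and time, so downstream code can rely on each trip's points being contiguous and chronologically ordered.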