How can I handle relational data in machine learning classification?
I am trying to classify incidents as true positives or false positives using machine learning.
I have a dataset of incidents in which every column describes an attribute of the incident. There is also a list of alerts associated with each incident. The list can contain 0-10 alerts, and every alert is a row holding the details of that alert, i.e. there is a one-to-many relationship between incidents and alerts.
I have experience classifying simple datasets that have a fixed set of columns for every row, but I am not sure how to handle relational data like this.
I am using scikit-learn for this.
As far as I understand, your data looks like this:

Incident table:

    id | i_attr0 | alerts
    0  | foo     | [alert0, alert1]
    ...

Alert table:

    id     | a_attr0
    alert0 | bar
    alert1 | baz
    ...

If that is the case, I would denormalize the tables into something like:

Incident-alert table:

    id | i_attr0 | alert0 | alert0_a_attr0 | alert1 | alert1_a_attr0 | etc.
    0  | foo     | true   | bar            | true   | baz            |

and work from there.
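As a rough illustration of the denormalization described above, here is a minimal sketch using pandas together with scikit-learn. The toy values, the `label` target column, the `denormalize` helper, the `"none"` placeholder for empty alert slots, and the choice of `RandomForestClassifier` are all assumptions made for the example; they do not come from the question.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

MAX_ALERTS = 10  # the question says each incident has 0-10 alerts

# Toy data with hypothetical values. "label" is an assumed target column:
# 1 = true positive, 0 = false positive.
incidents = pd.DataFrame({
    "id": [0, 1, 2],
    "i_attr0": ["foo", "qux", "foo"],
    "alerts": [["alert0", "alert1"], [], ["alert2"]],
    "label": [1, 0, 1],
})
alerts = pd.DataFrame({
    "id": ["alert0", "alert1", "alert2"],
    "a_attr0": ["bar", "baz", "bar"],
})


def denormalize(incidents, alerts, max_alerts=MAX_ALERTS):
    """Return one row per incident with a fixed set of alert-slot columns."""
    # One row per (incident, alert) pair; incidents without alerts drop out here.
    links = (
        incidents[["id", "alerts"]]
        .explode("alerts")
        .dropna(subset=["alerts"])
        .rename(columns={"id": "incident_id", "alerts": "alert_id"})
    )
    pairs = links.merge(alerts, left_on="alert_id", right_on="id")

    # Number the alerts within each incident (0, 1, 2, ...) to assign a slot.
    pairs["slot"] = pairs.groupby("incident_id").cumcount()
    pairs = pairs[pairs["slot"] < max_alerts]

    # Spread the slots into columns, keeping every incident and every slot.
    base = incidents.drop(columns="alerts").set_index("id")
    wide = pairs.pivot(index="incident_id", columns="slot", values="a_attr0")
    wide = wide.reindex(index=base.index, columns=range(max_alerts)).astype(object)

    # Presence flags (alert0, alert1, ...) plus a placeholder for empty slots.
    present = wide.notna()
    wide = wide.where(present, "none")
    present.columns = [f"alert{i}" for i in range(max_alerts)]
    wide.columns = [f"alert{i}_a_attr0" for i in range(max_alerts)]

    return pd.concat([base, present, wide], axis=1).reset_index()


table = denormalize(incidents, alerts)

# scikit-learn estimators need numeric input, so one-hot encode the strings.
X = pd.get_dummies(table.drop(columns=["id", "label"]))
y = table["label"]

clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)
```

Note that this layout treats the alert order as meaningful (slot 0 versus slot 1), and that new data has to be encoded into the same columns as the training data before calling `clf.predict`.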