Register Data Family
Organize or group similar datasets under the same data family.
Register your Data Family
Data families in MarkovML help organize or group related datasets. Think of a data family as a virtual folder containing all versions of datasets related to a specific topic.
For example, if you have datasets for sentiment analysis, you can group them under a "Sentiment Analysis" data family. Remember, once you register a dataset in MarkovML, you can't update it.
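The "virtual folder" idea can be pictured with a small, purely illustrative model. This is a sketch of the concept only, not how MarkovML stores data: the class names `DataFamily` and `DatasetVersion` are hypothetical, and the rule that a change arrives as a new version rather than an in-place update is an assumption made for the example.

```python
from dataclasses import dataclass, field, FrozenInstanceError


@dataclass(frozen=True)
class DatasetVersion:
    """Hypothetical stand-in for a registered dataset; frozen, so it cannot be edited."""
    name: str
    version: int


@dataclass
class DataFamily:
    """Hypothetical stand-in for a data family: a named group of dataset versions."""
    name: str
    datasets: list = field(default_factory=list)

    def register(self, dataset_name: str) -> DatasetVersion:
        # Assumption for this sketch: a change is recorded as a new version
        # instead of modifying a dataset that is already registered.
        existing = [d for d in self.datasets if d.name == dataset_name]
        new_version = DatasetVersion(dataset_name, len(existing) + 1)
        self.datasets.append(new_version)
        return new_version


family = DataFamily("Sentiment Analysis")
first = family.register("movie-reviews")
second = family.register("movie-reviews")
```

Here both versions of the hypothetical "movie-reviews" dataset live in the same family, and trying to assign to a field of `first` raises `FrozenInstanceError`, which mirrors the rule that a registered dataset can't be updated.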
You can create a data family either through the Web UI or using the Markov SDK. Make sure to create a data family before registering any related datasets.
1. Create a Data Family Using the MarkovML Web UI
Follow the steps below to create a new data family in MarkovML through the UI:
- Log In: Sign in to your MarkovML account.
- Navigate to Dataset Page: Click on Dataset and then on Add New Dataset.
- Proceed to Dataset Details: After choosing analyzers, click Next. A pop-up will appear asking for dataset details.
- Choose or Create Data Family: Add the dataset to an existing data family or create a new one by selecting Add new.
- Name and Describe: Give your data family a unique name and, if you want, a short description.
- Save: Click Save to complete the process.
2. Create a Data Family Using the Markov SDK
You can create a new data family directly from the Markov SDK using the markov.data.register_datafamily() method by providing the following information:
- name: Give a unique name to the data family.
- notes: Add notes or descriptions for future reference. (optional)
- lang: Set it as "en_us" if the dataset content is written in US English.
- source: Write the source name, such as Kaggle, for future reference. (optional)
Sample Code
import markov

# Create a new data family for the dataset
df_reg_resp = markov.data.register_datafamily(
    name="Hate Speech Data Family",  # Unique data family name
    notes="This is a data family for hate speech datasets",
    lang="en-us",
    source="SOURCE_OF_THIS_DATASET",  # e.g. kaggle, customer_alpha, annotation_
)
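If you register data families from scripts, it can help to assemble and check the arguments before calling the SDK. The following is a minimal sketch under stated assumptions: `build_datafamily_args` and `register_family` are hypothetical helpers, not part of the Markov SDK, and the default `lang="en-us"` is simply copied from the sample code. The check only catches an empty name; whether a name is unique can only be determined by MarkovML itself.

```python
def build_datafamily_args(name, notes=None, lang="en-us", source=None):
    """Hypothetical helper: assemble keyword arguments for register_datafamily()."""
    if not name or not name.strip():
        raise ValueError("A data family needs a non-empty name.")
    args = {"name": name.strip(), "lang": lang}
    # notes and source are optional, so include them only when provided.
    if notes:
        args["notes"] = notes
    if source:
        args["source"] = source
    return args


def register_family(**kwargs):
    """Hypothetical wrapper: call the SDK with the checked arguments."""
    import markov  # imported lazily so the helper above works without the SDK

    return markov.data.register_datafamily(**build_datafamily_args(**kwargs))


args = build_datafamily_args("Hate Speech Data Family", source="kaggle")
```

Calling `register_family(name="Hate Speech Data Family", source="kaggle")` would then pass the same arguments to `markov.data.register_datafamily()`, assuming the `markov` package is installed and configured.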
Now that you have successfully created a data family, let's move on to registering your dataset with MarkovML.