Commit c65865e

chore: Updating documentation for entity's join key (#2451)
Signed-off-by: pyalex <[email protected]>
1 parent 356788a commit c65865e

6 files changed: +27 -8 lines changed

.github/workflows/master_only.yml

Lines changed: 1 addition & 1 deletion
@@ -205,4 +205,4 @@ jobs:
 make push-${{ matrix.component }}-docker REGISTRY=${REGISTRY} VERSION=${GITHUB_SHA}

 docker tag ${REGISTRY}/${{ matrix.component }}:${GITHUB_SHA} ${REGISTRY}/${{ matrix.component }}:develop
-docker push ${REGISTRY}/${{ matrix.component }}:develop
+docker push ${REGISTRY}/${{ matrix.component }}:develop

docs/getting-started/concepts/entity.md

Lines changed: 1 addition & 1 deletion
@@ -6,7 +6,7 @@ An entity is a collection of semantically related features. Users define entitie
 driver = Entity(name='driver', value_type=ValueType.STRING, join_key='driver_id')
 ```

-Entities are typically defined as part of feature views. Entities are used to identify the primary key on which feature values should be stored and retrieved. These keys are used during the lookup of feature values from the online store and the join process in point-in-time joins. It is possible to define composite entities \(more than one entity object\) in a feature view. It is also possible for feature views to have zero entities. See [feature view](feature-view.md) for more details.
+Entities are typically defined as part of feature views. Entity name is used to reference the entity from a feature view definition and join key is used to identify the physical primary key on which feature values should be stored and retrieved. These keys are used during the lookup of feature values from the online store and the join process in point-in-time joins. It is possible to define composite entities \(more than one entity object\) in a feature view. It is also possible for feature views to have zero entities. See [feature view](feature-view.md) for more details.

 Entities should be reused across feature views.
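
The distinction this hunk documents (the entity name is what a feature view references, the join key is the physical key used for storage and lookup) can be sketched in plain Python. This is a minimal illustration, not Feast's implementation: `EntitySketch`, `FeatureViewSketch`, `resolve_join_keys`, and the `registry` dict are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EntitySketch:
    """Hypothetical stand-in for an entity: a logical name plus a physical join key."""
    name: str
    join_key: str


@dataclass(frozen=True)
class FeatureViewSketch:
    """Hypothetical stand-in for a feature view that references entities by name."""
    name: str
    entities: tuple


def resolve_join_keys(view, registry):
    """Translate the entity names listed on a view into the join keys used in storage."""
    return [registry[entity_name].join_key for entity_name in view.entities]


# The entity is named "driver", but rows are keyed by the "driver_id" column.
driver = EntitySketch(name="driver", join_key="driver_id")
registry = {driver.name: driver}

# The view refers to the entity by name, as in entities=["driver"].
view = FeatureViewSketch(name="driver_hourly_stats", entities=("driver",))

join_keys = resolve_join_keys(view, registry)
print(join_keys)
```

Under these assumptions the view never mentions `driver_id` directly; the physical column is recovered through the entity definition, which is why the same entity can be reused across feature views.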

docs/getting-started/concepts/feature-retrieval.md

Lines changed: 4 additions & 1 deletion
@@ -20,7 +20,10 @@ online_features = fs.get_online_features(
         'driver_locations:lon',
         'drivers_activity:trips_today'
     ],
-    entity_rows=[{'driver': 'driver_1001'}]
+    entity_rows=[
+        # {join_key: entity_value}
+        {'driver': 'driver_1001'}
+    ]
 )
 ```

docs/getting-started/quickstart.md

Lines changed: 14 additions & 4 deletions
@@ -95,14 +95,16 @@ driver_hourly_stats = FileSource(

 # Define an entity for the driver. You can think of entity as a primary key used to
 # fetch features.
-driver = Entity(name="driver_id", value_type=ValueType.INT64, description="driver id",)
+# Entity has a name used for later reference (in a feature view, eg)
+# and join_key to identify physical field name used in storages
+driver = Entity(name="driver", value_type=ValueType.INT64, join_key="driver_id", description="driver id",)

 # Our parquet files contain sample data that includes a driver_id column, timestamps and
 # three feature column. Here we define a Feature View that will allow us to serve this
 # data to our model online.
 driver_hourly_stats_view = FeatureView(
     name="driver_hourly_stats",
-    entities=["driver_id"],
+    entities=["driver"], # reference entity by name
     ttl=Duration(seconds=86400 * 1),
     features=[
         Feature(name="conv_rate", dtype=ValueType.FLOAT),
@@ -162,14 +164,16 @@ driver_hourly_stats = FileSource(

 # Define an entity for the driver. You can think of entity as a primary key used to
 # fetch features.
-driver = Entity(name="driver_id", value_type=ValueType.INT64, description="driver id",)
+# Entity has a name used for later reference (in a feature view, eg)
+# and join_key to identify physical field name used in storages
+driver = Entity(name="driver", value_type=ValueType.INT64, join_key="driver_id", description="driver id",)

 # Our parquet files contain sample data that includes a driver_id column, timestamps and
 # three feature column. Here we define a Feature View that will allow us to serve this
 # data to our model online.
 driver_hourly_stats_view = FeatureView(
     name="driver_hourly_stats",
-    entities=["driver_id"],
+    entities=["driver"], # reference entity by name
     ttl=Duration(seconds=86400 * 1),
     features=[
         Feature(name="conv_rate", dtype=ValueType.FLOAT),
@@ -213,8 +217,13 @@ from feast import FeatureStore
 # The entity dataframe is the dataframe we want to enrich with feature values
 entity_df = pd.DataFrame.from_dict(
     {
+        # entity's join key -> entity values
         "driver_id": [1001, 1002, 1003],
+
+        # label name -> label values
         "label_driver_reported_satisfaction": [1, 5, 3],
+
+        # "event_timestamp" (reserved key) -> timestamps
         "event_timestamp": [
             datetime.now() - timedelta(minutes=11),
             datetime.now() - timedelta(minutes=36),
@@ -320,6 +329,7 @@ feature_vector = store.get_online_features(
         "driver_hourly_stats:avg_daily_trips",
     ],
     entity_rows=[
+        # {join_key: entity_value}
         {"driver_id": 1004},
         {"driver_id": 1005},
     ],
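
The `{join_key: entity_value}` comment added in this hunk says that each entity row is keyed by the entity's join key (`driver_id`), not by the entity name (`driver`). A rough sketch of that lookup rule, using a plain dict as a stand-in for an online store, is given below. The function `get_online_features_sketch`, the `online_store` dict, and the stored feature values are all hypothetical and made up for illustration; this is not how Feast's `get_online_features` is implemented.

```python
def get_online_features_sketch(store, join_key, entity_rows, feature_names):
    """Look up features for each entity row, which must be keyed by the join key."""
    results = []
    for row in entity_rows:
        if join_key not in row:
            # A row keyed by the entity name instead of the join key cannot be resolved.
            raise KeyError(f"entity row {row!r} is missing join key {join_key!r}")
        stored = store.get(row[join_key], {})
        result = {join_key: row[join_key]}
        for feature_name in feature_names:
            # Missing entities or features are reported as None in this sketch.
            result[feature_name] = stored.get(feature_name)
        results.append(result)
    return results


# Made-up values: only driver 1004 has data in this toy store.
online_store = {1004: {"conv_rate": 0.5, "avg_daily_trips": 12}}

rows = get_online_features_sketch(
    online_store,
    join_key="driver_id",
    entity_rows=[{"driver_id": 1004}, {"driver_id": 1005}],
    feature_names=["conv_rate", "avg_daily_trips"],
)
print(rows)
```

In this sketch, a row such as `{"driver": 1004}` raises a `KeyError`, which mirrors the confusion the added comments are meant to prevent.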

docs/how-to-guides/feast-snowflake-gcp-aws/read-features-from-the-online-store.md

Lines changed: 1 addition & 0 deletions
@@ -34,6 +34,7 @@ fs = FeatureStore(repo_path="path/to/feature/repo")
 online_features = fs.get_online_features(
     features=features,
     entity_rows=[
+        # {join_key: entity_value, ...}
         {"driver_id": 1001},
         {"driver_id": 1002}]
 ).to_dict()

docs/tutorials/driver-stats-on-snowflake.md

Lines changed: 6 additions & 1 deletion
@@ -124,7 +124,12 @@ fs.materialize_incremental(end_date=datetime.now())
 {% code title="test.py" %}
 ```python
 online_features = fs.get_online_features(
-    features=features, entity_rows=[{"driver_id": 1001}, {"driver_id": 1002}],
+    features=features,
+    entity_rows=[
+        # {join_key: entity_value}
+        {"driver_id": 1001},
+        {"driver_id": 1002}
+    ],
 ).to_dict()
 ```
 {% endcode %}
