I'm trying to follow http://php-nlp-tools.com/posts/introducing-latent-dirichlet-allocation.html, but calling getDocumentsPerTopicsProbabilities at the end fails:
$docs = [
'The queen does something',
'Queen is very good queen',
'Mission mission mission',
'What is mission your mission'
];
$tok = new WhitespaceTokenizer();
$tset = new TrainingSet();
foreach ($docs as $line) {
$tset->addDocument(
'', // the class is not used by the lda model
new TokensDocument(
$tok->tokenize(
mb_strtolower($line)
)
)
);
}
$lda = new Lda(
new DataAsFeatures(), // a feature factory to transform the document data
2, // the number of topics we want
1, // the dirichlet prior assumed for the per document topic distribution
1 // the dirichlet prior assumed for the per word topic distribution
);
$lda->train($tset,50);
$lda->getDocumentsPerTopicsProbabilities(2);

This results in:
Undefined offset: 0 at
vendor/nlp-tools/nlp-tools/src/NlpTools/Models/Lda.php:243
This probably requires something along the lines of:
if (!isset($count_topics_docs[$doc])) {
$count_topics_docs[$doc] = [];
}
if (!isset($count_topics_docs[$doc][$t])) {
$count_topics_docs[$doc][$t] = 0;
}

Also, further down the method uses a variable, $limit_docs, that is undefined. Maybe the method signature is incorrect (public function getDocumentsPerTopicsProbabilities($limit_docs = -1)); should it be $limit_words there?
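To show what I mean by the guard above, here is a minimal standalone sketch. It assumes the failing line increments a nested counter that was never initialised; I have not confirmed that this is what Lda.php does at line 243. The function countTopicsPerDoc and the input pairs are hypothetical and not part of NlpTools.

```php
<?php
// Hypothetical helper, NOT part of NlpTools: counts how often each topic
// is assigned within each document, initialising missing keys first so
// the increment never touches an undefined offset.
function countTopicsPerDoc(array $assignments): array
{
    $count_topics_docs = [];
    foreach ($assignments as [$doc, $t]) {
        if (!isset($count_topics_docs[$doc])) {
            $count_topics_docs[$doc] = [];
        }
        if (!isset($count_topics_docs[$doc][$t])) {
            $count_topics_docs[$doc][$t] = 0;
        }
        $count_topics_docs[$doc][$t]++;
    }
    return $count_topics_docs;
}

// Made-up (document index, topic index) pairs, for illustration only.
$counts = countTopicsPerDoc([[0, 0], [0, 1], [0, 0], [1, 1]]);
print_r($counts);
```

Without the two isset checks, the first increment for each new (document, topic) pair would read a key that does not exist yet, which is the kind of access PHP reports as an undefined offset.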
Anyway, after running this method on this input:
$docs = [
'The queen does something',
'Queen is very good queen',
'Mission mission mission',
'What is mission your mission'
];
...
$lda->getDocumentsPerTopicsProbabilities(2);

I get this result:
[
0.3333333333333333,
0.3333333333333333,
0.3333333333333333,
0.3333333333333333
]
And I'm not sure how to interpret that... :)
Thanks!
