Commit
Fixed pytest for word2vec embedding-using list of lists
vanshigupta04 committed Nov 9, 2023
1 parent 33eb1e9 commit b2a9a0e
Showing 1 changed file with 1 addition and 4 deletions.
5 changes: 1 addition & 4 deletions utils/util_modeler.py
@@ -114,15 +114,12 @@ def fit_transform(
 np.ndarray: Word2Vec embeddings for the input text.
 """

-embedding = []

 # Initialize an array to store Word2Vec embeddings for the input text
 words = self.tokenizer.tokenize(text) # Tokenize the document
 word_vectors = [self.model[word] if word in self.model else np.zeros(self.model.vector_size) for word in words]
 document_embedding = np.mean(word_vectors, axis=0) # Calculate the mean of word embeddings for the document
-embedding.append(document_embedding)

-return np.array(embedding)
+return document_embedding.tolist()


 class TPSampler:
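
To illustrate what this change does to the return value, here is a minimal sketch that mirrors the averaging logic in the hunk above. It is not the repository's code: `toy_vectors`, `VECTOR_SIZE`, and `document_embedding` are hypothetical names, a plain dict stands in for the trained Word2Vec model, and whitespace splitting stands in for `self.tokenizer`.

```python
import numpy as np

# Hypothetical stand-in for a trained Word2Vec model: a dict of 3-d vectors.
toy_vectors = {
    "hello": np.array([3.0, 0.0, 6.0]),
    "world": np.array([3.0, 6.0, 0.0]),
}
VECTOR_SIZE = 3  # assumed vector size for this toy example


def document_embedding(text):
    """Average the per-word vectors; unknown words contribute a zero vector."""
    words = text.split()  # assumption: whitespace tokenization
    word_vectors = [
        toy_vectors[w] if w in toy_vectors else np.zeros(VECTOR_SIZE)
        for w in words
    ]
    return np.mean(word_vectors, axis=0)


doc = document_embedding("hello world unknown")

# Before the commit: the vector was appended to a list and wrapped in an
# array, giving a 2-D result of shape (1, vector_size).
old_result = np.array([doc])

# After the commit: the 1-D mean vector is returned as a plain Python list.
new_result = doc.tolist()

print(old_result.shape)  # (1, 3)
print(new_result)        # [2.0, 2.0, 2.0]
```

Under these assumptions, calling the new version once per document and collecting the results yields a list of lists, which appears to be the shape the commit title refers to.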
