When I manually paste this text of Plutarch into Arethusa (also taken from http://demo.fragmentarytexts.org/en/revolt-of-samos/sophocles-strategia.html):
καί ποτε τοῦ Σοφοκλέους, ὅτε συστρατηγῶν ἐξέπλευσε μετ᾽ αὐτοῦ, παῖδα καλὸν ἐπαινέσαντος, ‛οὐ μόνον,’ ἔφη, ‛τὰς χεῖρας, ὦ Σοφόκλεις, δεῖ καθαρὰς ἔχειν τὸν στρατηγόν, ἀλλὰ καὶ τὰς ὄψεις.’
During tokenization, the opening quotation marks in ‛οὐ and ‛τὰς are not separated from the words. This happens with single (') as well as double (") quotation marks.
The same goes for the em dash. The following passage from Plato's Alcibiades 104a is not tokenized correctly: ... κάλλιστός τε καὶ μέγιστος—καὶ τοῦτο μὲν δὴ παντὶ δῆλον ἰδεῖν ὅτι οὐ ψεύδῃ—ἔπειτα νεανικωτάτου... I am offered μέγιστος—καὶ and ψεύδῃ—ἔπειτα as single words.
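For illustration, here is a minimal sketch of the splitting behavior both reports expect. It is not Arethusa's actual tokenizer: the `PUNCT` set and the `tokenize` helper are hypothetical names chosen for this example, and the set of punctuation characters is an assumption.

```python
import re

# Assumed set of characters that should become tokens of their own.
# It includes the reversed opening quote, curly and straight quotes,
# the em dash, and common Greek punctuation. Illustration only.
PUNCT = "‛‘’“”\"'—,.;·"

# Match either one punctuation character, or a run of characters that
# are neither whitespace nor punctuation (i.e. a word).
_CLASS = re.escape(PUNCT)
TOKEN_RE = re.compile(f"[{_CLASS}]|[^\\s{_CLASS}]+")


def tokenize(text: str) -> list[str]:
    """Split text into word tokens and single-character punctuation tokens."""
    return TOKEN_RE.findall(text)


if __name__ == "__main__":
    print(tokenize("‛οὐ μόνον,’ ἔφη"))
    print(tokenize("μέγιστος—καὶ τοῦτο"))
```

With this approach the opening quote in ‛οὐ and the dash in μέγιστος—καὶ each come out as separate tokens. The elision mark in μετ᾽ is deliberately left out of the assumed set, so it stays attached to its word.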
Hi Neven,
Thanks for the report! I fixed this yesterday. We'll soon deploy the next iteration of the tokenization code; we need to add a few tests over the next couple of days, and then we're good to go!