Commit

Add support for diacritics and for standalone Hamza characters
michaeldiscala committed Oct 12, 2015
1 parent 91bac50 commit 4a9e3d8
Showing 1 changed file with 19 additions and 1 deletion.
20 changes: 19 additions & 1 deletion lib/arabic-letter-connector/logic.rb
@@ -17,10 +17,10 @@ def initialize(common, isolated, final, initial, medial, connects)
@connects = connects
end

# @return [Boolean] can the character connect with the next character
def connects?
@connects
end

end

# Determine the form of the current character (:isolated, :initial, :medial,
@@ -170,6 +170,24 @@ def self.charinfos
add("0640", "0640", "0640", "0640", "0640", true) # Tatweel
add("0649", "feef", "fef0", "feef", "fef0", false) # Alef Layyina

# Prevent words from breaking on diacritics by marking the diacritics as
# connected
#
# List of Diacritics pulled from http://unicode.org/charts/PDF/U0600.pdf
# under the heading "Tashkil from ISO 8859-6"
[
"064b", # FATHATAN
"064c", # DAMMATAN
"064d", # KASRATAN
"064e", # FATHA
"064f", # DAMMA
"0650", # KASRA
"0651", # SHADDA
"0652" # SUKUN
].each do |codepoint|
add(codepoint, codepoint, codepoint, codepoint, codepoint, true)
end

# The common codes for these four Lam-Alef characters are in the
# Arabic Presentation Forms-B block (rather than the regular Arabic block),
# because they are introduced by the replace_lam_alef function
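
The diff's comment on diacritics ("prevent words from breaking ... by marking the diacritics as connected") can be illustrated with a minimal sketch. It assumes a form selector that looks only at the immediately preceding and following characters; the names `Info` and `form_for` are hypothetical and are not taken from the gem, whose actual selection logic is not shown in this diff.

```ruby
# Hypothetical, simplified model of neighbour-based form selection.
# Info and form_for are illustrative names, not the gem's API.
Info = Struct.new(:connects) do
  def connects?
    connects
  end
end

# Pick a form for the current character from its neighbours alone.
# A neighbour that is missing from the table is treated as non-Arabic.
def form_for(table, previous_char, next_char)
  prev_info = table[previous_char]
  next_info = table[next_char]
  if prev_info && next_info
    prev_info.connects? ? :medial : :initial
  elsif prev_info
    prev_info.connects? ? :final : :isolated
  elsif next_info
    :initial
  else
    :isolated
  end
end

beh   = [0x0628].pack("U") # ARABIC LETTER BEH
fatha = [0x064E].pack("U") # ARABIC FATHA (diacritic)

letters_only    = { beh => Info.new(true) }
with_diacritics = letters_only.merge(fatha => Info.new(true))

# Word: BEH, FATHA, BEH. Form chosen for the last BEH, whose previous
# character is the FATHA and which has no following character.
WITHOUT_ENTRY = form_for(letters_only, fatha, nil)
WITH_ENTRY    = form_for(with_diacritics, fatha, nil)
```

Under this assumed model, a diacritic that is absent from the table looks like a word boundary, so the last BEH falls back to its isolated form; once the diacritic is registered as connecting, the same letter gets its final form. One limitation of the sketch is that a diacritic marked as connecting would also appear to join a following letter to a preceding non-connecting letter such as DAL.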
