Page 2 of 3

Re: Hypothetical: Making a word game and needing to check wo

Posted: Tue Jan 24, 2012 8:19 pm
by Robin
kikito wrote:
Robin wrote:(but there is also a lot of overhead for all those strings and tables)
For the tables, yes. For the strings, not so much. A big table with all the possible words would probably use up more memory IMHO.
For words with lots of overlap in prefixes (for example, all words start with "foo", a try would save three characters per word), it might take up less memory. But you'll end up with lots of short strings, and in Lua each string is a value, with associated overhead in memory, for garbage collection and things like that.

We should test is really, but I'm not really in the mood for SCIENCE! right now.

I think I tried tries in Python, but it might also have been Lua, and it had really bad performance.

Re: Hypothetical: Making a word game and needing to check wo

Posted: Tue Jan 24, 2012 8:30 pm
by Erniecz
I downloaded a list of words in the Czech from here:
http://www.pspad.com/en/download.php
(there is also English one and much more). Size of unpacked list is 10 MB and list contains 808 056 words. I tried this code:

Code: Select all

function love.load ()
  words = {}
  for line in love.filesystem.lines("Czech.3-2-1.dic") do
    words [line] = true
  end
  fps = 0
  word = ""
end


function love.update (dt)
  fps = love.timer.getFPS ()
end

function isLetter (char)
  return (char == 'a' or char == 'b' or char == 'c' or char == 'd' or char == 'e' or char == 'f' or char == 'g' or char == 'h' or char == 'i' or char == 'j'
   or char == 'k' or char == 'l' or char == 'm' or char == 'n' or char == 'o' or char == 'p' or char == 'q' or char == 'r' or char == 's' or char == 't'
   or char == 'u' or char == 'v' or char == 'w' or char == 'x' or char == 'y' or char == 'z')
end

function love.keypressed (key, unicode)
  if isLetter (key) then
    word = word..key
  elseif key == "backspace" then
    word = ""
  end	
end

function love.draw ()
  love.graphics.print (fps, 0, 0)
  love.graphics.print (word, 0, 12)
  love.graphics.print (tostring(words [word]), 0, 24)
end
Creating table at start of the program is slow (about one minute on my PC), but using it to look up for words is quick enough. I think if you just used smaller wordlist (English one there is smaller) it would be totally sufficient.
Also you could try to create table once, save it to the file and just run it, it could be faster.

Re: Hypothetical: Making a word game and needing to check wo

Posted: Wed Jan 25, 2012 6:02 am
by ivan

Code: Select all

function isLetter (char)
  return (char == 'a' or char == 'b' or char == 'c' or char == 'd' or char == 'e' or char == 'f' or char == 'g' or char == 'h' or char == 'i' or char == 'j'
   or char == 'k' or char == 'l' or char == 'm' or char == 'n' or char == 'o' or char == 'p' or char == 'q' or char == 'r' or char == 's' or char == 't'
   or char == 'u' or char == 'v' or char == 'w' or char == 'x' or char == 'y' or char == 'z')
end
could become:

Code: Select all

function isLetter(char)
  local byte = string.byte(char)
  return (byte >= 97 and byte <= 122) or (byte >= 65 and byte <= 90)
end
Lowercase letters are between 97 and 122, upper case is between 65 and 90

Re: Hypothetical: Making a word game and needing to check wo

Posted: Wed Jan 25, 2012 6:48 am
by Jasoco
A thousand Internet Doubloons to anyone who creates a "Word legality checker" library. It would be very useful if anyone ever needs to create a word game.

*Approximate value of one Internet Doubloon is 1/1000th of a cent. Internet Doubloons are redeemable on http://www.amazon.edu. Internet Doubloons are non-transferrable.

Re: Hypothetical: Making a word game and needing to check wo

Posted: Wed Jan 25, 2012 6:55 am
by MarekkPie
Homer: One adult and four children.
Woman: Would you like to buy some Itchy and Scratchy Money?
Homer: What's that?
Woman: Well it's money that's made just for the park. It works just
like regular money, but it's, er..."fun".
Bart: Do it, Dad.
Homer: Well, OK, if it's fun...let's see, uh...I'll take $1100 worth.
[he walks in, sees all the signs: "No I&S Money", "We Don't Take
Itchy and Scratchy Money", etc.]
Aw!

Re: Hypothetical: Making a word game and needing to check wo

Posted: Wed Jan 25, 2012 8:32 am
by miko
ivan wrote:could become:

Code: Select all

function isLetter(char)
  local byte = string.byte(char)
  return (byte >= 97 and byte <= 122) or (byte >= 65 and byte <= 90)
end
which could become:

Code: Select all

function isLetter(char)
  return char:match("%a") and true or false
end
ivan wrote: Lowercase letters are between 97 and 122, upper case is between 65 and 90
In English. For other languages it depends on the character set and the encoding.

Re: Hypothetical: Making a word game and needing to check wo

Posted: Wed Jan 25, 2012 8:34 am
by miko
MarekkPie wrote:http://en.wikipedia.org/wiki/DAWG <== This is the one you're looking for.
Nice catch! Yes, that is what I would choose. This is just designed for exactly this kind of problems.

Re: Hypothetical: Making a word game and needing to check wo

Posted: Wed Jan 25, 2012 10:20 am
by Robin
Or just replace isLetter(char) with char:match("%a") completely. It's only use in an if-statement, which handles non-booleans just fine.

So:

Code: Select all

function love.keypressed (key, unicode)
  if key:match("^%a$") then -- needed, so we don't get lshift, rctrl, escape, ... 
    word = word..key
  elseif key == "backspace" then
    word = ""
  end   
end

Re: Hypothetical: Making a word game and needing to check wo

Posted: Thu Jan 26, 2012 5:36 am
by Taehl
One cheap-but-quick way to make checking quicker is to have a separate text file for each letter, listing all words starting with that letter. Want to check if "love" is a word? Check L.txt. There are probably more efficient ways, but this is probably the simplest.

Re: Hypothetical: Making a word game and needing to check wo

Posted: Thu Jan 26, 2012 1:21 pm
by Robin
Taehl wrote:One cheap-but-quick way to make checking quicker is to have a separate text file for each letter, listing all words starting with that letter. Want to check if "love" is a word? Check L.txt. There are probably more efficient ways, but this is probably the simplest.
It would be faster put them into tables. RAM is much faster than hard disk and Lua is very fast with tables.

Plus, even if those N.txt files are sorted, performance will be O(log(n)) instead of O(1). (In general. With a large amount of words the performance might lie closer to O(log(n)) for tables, as kikito said, but still, it will probably be much faster than reading from disk.)