Mailing list for all users of the OCaml language and system.
* [OT?] spamoracle concurrency
From: Ian Zimmerman @ 2007-04-13  6:06 UTC
  To: caml-list


I couldn't find a forum specifically for spamoracle :-)

The question is: does spamoracle do any kind of locking on the database?
And what kind?  Clearly, if I pipe my mails through "spamoracle mark" in
my procmail (or maildrop, etc.) configuration, spamoracle may run and
access the database at completely unpredictable times.  Is it safe to
do "spamoracle add" while this is enabled?  Or do I have to slap a
locking scheme on top myself?

Bonus question, if locking is in fact done, does it let multiple
"spamoracle mark" processes through at the same time (which should be safe)?

I found to my dismay that none of the ~5 bayesian type filters I tried
today answers these (obvious?) questions in their documentation :-(

-- 
Experience with Asset Control an asset.



* Re: [Caml-list] [OT?] spamoracle concurrency
From: Alain Frisch @ 2007-04-13  6:39 UTC
  To: Ian Zimmerman; +Cc: caml-list

Ian Zimmerman wrote:
> The question is: does spamoracle do any kind of locking on the database?
> And what kind?  Clearly, if I pipe my mails through "spamoracle mark" in
> my procmail (or maildrop, etc.) configuration, spamoracle may run and
> access the database at completely unpredictable times.  Is it safe to
> do "spamoracle add" while this is enabled?  Or do I have to slap a
> locking scheme on top myself?
> 
> Bonus question, if locking is in fact done, does it let multiple
> "spamoracle mark" processes through at the same time (which should be safe)?

Looking very quickly at the code: spamoracle writes its database to a fresh
temporary file and then renames it. With a local file system (not NFS)
under Unix, this is atomic. It is thus safe to "mark" and "add" in
parallel, but if you do several "add" in parallel, you won't get the
expected behavior: each "add" starts from the database it read, so the
last one to rename wins and the other updates are lost (but the
database will not be corrupted).

-- Alain
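
The write-then-rename pattern described above can be sketched as follows.
This is a minimal illustration, not spamoracle's actual code: the names
save_atomically and load are hypothetical, and the database is assumed to
be any value that Marshal can serialize.

```ocaml
(* Hypothetical helper, not taken from spamoracle: save a value by writing
   a fresh temporary file and renaming it over the target. *)
let save_atomically (path : string) (db : 'a) : unit =
  (* Create the temporary file next to the target: a rename is only
     atomic when both names live on the same file system. *)
  let dir = Filename.dirname path in
  let tmp = Filename.temp_file ~temp_dir:dir "db" ".tmp" in
  let oc = open_out_bin tmp in
  (try
     Marshal.to_channel oc db [];
     close_out oc
   with e ->
     (* On failure, discard the partial temporary file and re-raise. *)
     close_out_noerr oc;
     Sys.remove tmp;
     raise e);
  (* A reader opening [path] gets either the old file or the new one,
     never a half-written one. *)
  Sys.rename tmp path

(* Hypothetical counterpart: read the whole database back into memory. *)
let load (path : string) : 'a =
  let ic = open_in_bin path in
  Fun.protect ~finally:(fun () -> close_in_noerr ic)
    (fun () -> Marshal.from_channel ic)
```

Nothing here stops two writers from each loading the same old database and
saving over one another, which is the lost-update case mentioned above for
parallel "add" runs.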



* Re: [Caml-list] [OT?] spamoracle concurrency
From: Gabriel Kerneis @ 2007-04-13  6:47 UTC
  To: caml-list; +Cc: Ian Zimmerman

On 13 Apr 2007 02:06:04 -0400, Ian Zimmerman <itz@madbat.mine.nu>
wrote:
> 
> I couldn't find a forum specifically for spamoracle :-)
> 
> The question is: does spamoracle do any kind of locking on the
> database? And what kind?  Clearly, if I pipe my mails through
> "spamoracle mark" in my procmail (or maildrop, etc.) configuration,
> spamoracle may run and access the database at completely
> unpredictable times.  Is it safe to do "spamoracle add" while this is
> enabled?  Or do I have to slap a locking scheme on top myself?

After a quick glance at the source code of spamoracle 1.4 (which I used
a few years ago for a school project), I would answer "no": the database
is read from file into memory, and if you do "spamoracle add", it
updates the database (in memory), then dumps it back to a (temporary)
file, which is finally renamed over your previous
database. As far as I can see, there is no locking of any kind.
But if you process your mails one by one, there shouldn't be any
problem either; it depends on your MDA.

Regards,
-- 
Gabriel Kerneis
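
Since no locking is reported in spamoracle itself, a scheme "slapped on
top" could look like the sketch below. It is only an illustration under
stated assumptions: with_lock_file is a hypothetical name, the lock is a
plain lock file created exclusively, and there is no handling of stale
locks left behind by a crashed process.

```ocaml
(* Hypothetical wrapper: run [f] only while holding an exclusive lock file.
   Returns [None] when the lock file could not be created, which this
   sketch treats as "another process holds the lock". *)
let with_lock_file (lock_path : string) (f : unit -> 'a) : 'a option =
  match
    (* Open_excl makes the creation fail if the file already exists. *)
    open_out_gen [Open_wronly; Open_creat; Open_excl] 0o600 lock_path
  with
  | exception Sys_error _ ->
    (* Any Sys_error (including e.g. permission problems) ends up here. *)
    None
  | oc ->
    close_out oc;
    (* Remove the lock file whether [f] returns or raises. *)
    Some (Fun.protect ~finally:(fun () -> Sys.remove lock_path) f)
```

Only the updating side would need this: the thunk passed as f would be the
code that launches "spamoracle add", and a None result means the caller
should retry later. "spamoracle mark" can stay outside the lock, given that
readers are safe against a concurrent rename.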



* Re: [Caml-list] [OT?] spamoracle concurrency
From: Gabriel Kerneis @ 2007-04-13  6:56 UTC
  To: caml-list

On Fri, 13 Apr 2007 08:39:10 +0200, Alain Frisch
<Alain.Frisch@inria.fr> wrote:
> Looking very quickly at the code: spamoracle writes its database to a fresh
> temporary file and then renames it. With a local file system (not NFS)
> under Unix, this is atomic. It is thus safe to "mark" and "add" in
> parallel, but if you do several "add" in parallel, you'll not get the
> expected behavior (but the database will not be corrupted).

(I'm not a Unix guru, so this might be a silly question.)

What about the following scenario:
1) "spamoracle add" reads the database, updates it and writes it to a
fresh file
2) "spamoracle read" begins to read the database from file
3) "spamoracle add" renames the file
4) "spamoracle read" finishes reading the database from file and closes
the file (which no longer exists?)

Is it safe? I guess the rename only changes the i-nodes, but does it
influence "spamoracle read" in any way?

Regards,
-- 
Gabriel Kerneis



* Re: [Caml-list] [OT?] spamoracle concurrency
From: Alain Frisch @ 2007-04-13  7:57 UTC
  To: Gabriel Kerneis; +Cc: caml-list

Gabriel Kerneis wrote:
> What about the following scenario:
> 1) "spamoracle add" reads the database, updates it and writes it to a
> fresh file
> 2) "spamoracle read" begins to read the database from file
> 3) "spamoracle add" renames the file
> 4) "spamoracle read" finishes reading the database from file and closes
> the file (which no longer exists?)
>
> Is it safe?

Yes, it is. In step 4, "spamoracle read" sees the old version of the
file through the already opened file descriptor.

-- Alain
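
The four-step scenario can be replayed in a single process, as in the
sketch below. It assumes a Unix-like system (renaming over a file that is
still open may fail elsewhere); the file names are temporary and purely
illustrative, and read_across_rename is a hypothetical name.

```ocaml
(* Small helper: write [contents] to [path], replacing any previous file. *)
let write_file (path : string) (contents : string) : unit =
  let oc = open_out_bin path in
  output_string oc contents;
  close_out oc

(* Returns what the early reader saw and what a later reader sees. *)
let read_across_rename () : string * string =
  let db = Filename.temp_file "db" ".old" in
  let fresh = Filename.temp_file "db" ".new" in
  write_file db "old database\n";
  write_file fresh "new database\n";       (* step 1: writer fills a fresh file *)
  let reader = open_in_bin db in           (* step 2: reader opens the database *)
  Sys.rename fresh db;                     (* step 3: writer renames over it *)
  let seen_by_reader = input_line reader in (* step 4: reader finishes reading *)
  close_in reader;
  (* A process opening the name after the rename gets the new file. *)
  let later = open_in_bin db in
  let seen_later = input_line later in
  close_in later;
  Sys.remove db;
  (seen_by_reader, seen_later)
```

The already-open channel keeps referring to the old file, whose data stays
available until the last descriptor on it is closed; only the name now
points at the new file.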

