Mailing list for all users of the OCaml language and system.
From: Takayuki Kazama <kazaan@hh.iij4u.or.jp>
To: caml-list@inria.fr
Subject: [Caml-list] ocamlnet doesn't correspond to pcre from 04/29 change onward
Date: Thu, 14 Oct 2004 22:58:26 +0900
Message-ID: <20041014135826.GA2464@hh.iij4u.or.jp>

Hello, all.

I've found that the code in ocamlnet-0.98 needs to be fixed to work with
the new behaviour of Pcre.get_substring.

Since 2004-04-29, the function "get_substring" no longer returns the empty
string when the requested substring was not captured. It raises "Not_found"
instead. (See pcre-ocaml's changelog.)
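
A minimal sketch of the behaviour change, written without linking pcre-ocaml.
The helper "substring_or_empty" is hypothetical (it is not part of pcre-ocaml
or ocamlnet), and "get_after_change" is only a stand-in for a partially
applied Pcre.get_substring in which group 3 matched and group 2 was not
captured:

```ocaml
(* Hypothetical helper: restores the old convention of returning ""
   for a group that was not captured. *)
let substring_or_empty (get : int -> string) (group : int) : string =
  try get group with Not_found -> ""

(* Stand-in for [Pcre.get_substring substrings] after the change:
   group 3 captured "amp", every other group raises Not_found. *)
let get_after_change = function
  | 3 -> "amp"
  | _ -> raise Not_found

let () =
  (* Uncaptured group: the wrapper yields "" instead of an exception. *)
  assert (substring_or_empty get_after_change 2 = "");
  (* Captured group: the value is passed through unchanged. *)
  assert (substring_or_empty get_after_change 3 = "amp")
```

With the real library this would presumably be used as
substring_or_empty (Pcre.get_substring substrings) 2.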

As a result, Netencoding.Html.decode no longer seems to work correctly.
I've checked this on:
	Debian GNU/Linux(sid)	ocaml			3.08.1-1
				libpcre-ocaml		5.08.1-2
				libocamlnet-ocaml	0.98-2.1
and
	M$ Windows2000		Ocaml-MinGW-Maxi Distribution.
(Note: I changed netstring/Makefile for the ocaml/mingw build,
because ocaml/mingw has no ocamlmklib.)

The problem looks like this:

shell$ ocaml
# #use "topfind";;
# #require "netstring";;
# Netencoding.Html.encode_from_latin1 "(a<b)&(c>d)";;
- : string = "(a&lt;b)&amp;(c&gt;d)"
	(* Encoding works properly. But... *)
# Netencoding.Html.decode_to_latin1 "(a&lt;b)&amp;(c&gt;d)";;
Exception: Not_found.
	(* Oops. Now try numeric character references, like &#60; *)
# Netencoding.Html.decode_to_latin1 "(a&#60;b)&#38;(c&#62;d)";;
- : string = "(a<b)&(c>d)"
	(* Good. So only the named form, "&[a-zA-Z]+;", fails to decode. *)

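To fix this problem, change netencoding.ml as shown below. (This is diff -c
output with the fixed netencoding.ml listed first and the original
netencoding.ml.org second, so the "***" hunk is the new code.)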
*** netencoding.ml		Thu Oct 14 20:54:56 2004
--- netencoding.ml.org		Sat Sep  4 22:20:36 2004
***************
*** 1536,1547 ****
	 (* TODO: avoid String.sub *)
	 let occurence = occurences.(k) in
	 let replacement =
!	   try
!	     let n = int_of_string (Pcre.get_substring occurence 2) in
!	       makechar n
!	   with Not_found ->
!	     let name = Pcre.get_substring occurence 3 in
!	       lookup_entity name
	 in
	 Buffer.add_string buf replacement;
	 n := n1;
--- 1536,1551 ----
	 (* TODO: avoid String.sub *)
	 let occurence = occurences.(k) in
	 let replacement =
!	   let num = Pcre.get_substring occurence 2 in   (* or "" *)
!	   if num <> "" then begin
!	     let n = int_of_string num in
!	     makechar n
!	   end
!	   else begin
!	     let name = Pcre.get_substring occurence 3 in  (* or "" *)
!	     assert(name <> "");
!	     lookup_entity name
!	   end
	 in
	 Buffer.add_string buf replacement;
	 n := n1;
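
The try/with version of the replacement logic can be exercised on its own
with a small model. This is only a sketch: "makechar" and "lookup_entity"
here are simplified stand-ins, not ocamlnet's real definitions, and "get"
models Pcre.get_substring applied to one occurrence (group 2 is the numeric
reference, group 3 the entity name, as in the patch).

```ocaml
(* Simplified stand-ins, NOT ocamlnet's real code. *)
let makechar n = String.make 1 (Char.chr n)   (* Latin-1 range only *)

let lookup_entity = function
  | "lt" -> "<"
  | "gt" -> ">"
  | "amp" -> "&"
  | name -> "&" ^ name ^ ";"                  (* unknown: leave as is *)

(* [get group] models [Pcre.get_substring occurence group]:
   it raises Not_found for a group that was not captured. *)
let replacement (get : int -> string) : string =
  try makechar (int_of_string (get 2))
  with Not_found -> lookup_entity (get 3)

(* Model matches: only one of the two groups is captured. *)
let numeric_match s = function 2 -> s | _ -> raise Not_found
let named_match s = function 3 -> s | _ -> raise Not_found

let () =
  assert (replacement (numeric_match "60") = "<");   (* &#60;  *)
  assert (replacement (named_match "amp") = "&")     (* &amp;  *)
```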

I also suggest changing the functions "matched_string" and "matched_group",
because Pcre.get_substring now raises Not_found 8)

regards.

---
Takayuki Kazama

mail : kazaan@hh.iij4u.or.jp
blog : http://kazaan.no-ip.com/~kazaan/ (SORRY JAPANESE ONLY)

-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners

