Re: slug url encoding problem for Chinese Characters
- From: wsh
- Date: 2010-06-20 @ 17:30
- Subject: Re: slug url encoding problem for Chinese Characters
Armin Ronacher wrote:
> Hi,
>
> On 6/20/10 5:11 AM, wsh wrote:
>
>> I've successfully set up a Zine development instance on my machine. The only
>> problem is that if I leave a comment on a post whose title contains
>> Chinese characters, Zine shows me a Page Not Found error page after I
>> submit the comment.
>>
> Two questions. Are you using the development version? This should
> not happen there. It will fall back to numbers in that case.
> Alternatively you can disable ASCII-only slugs in the Admin panel.
>
>
> Regards,
> Armin
>
>
Yes. I got the latest code from the repository:
$ hg clone http://dev.pocoo.org/hg/zine-main zine
I've tried the ASCII-only slugs setting too. It doesn't seem to help. I checked the code and found a clue.
The problem seems to be caused by the "_redirect_target" hidden field in the form. I captured the HTTP request headers in Firefox as follows:
http://localhost:4000/2010/6/20/%E6%B1%89
GET /2010/6/20/%E6%B1%89 HTTP/1.1
Host: localhost:4000
User-Agent: Mozilla/5.0 (X11; U; Linux i686; zh-CN; rv:1.9.2.3) Gecko/20100401 Firefox/3.6.3
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: zh-cn,zh;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: GB2312,utf-8;q=0.7,*;q=0.7
Keep-Alive: 115
Connection: keep-alive
Referer: http://localhost:4000/2010/6/20/%25E6%25B1%2589
Cookie: zine_session="ltMjGAFsomUqGyUae33eAa6W/Y4=?_expires=STEyNzk3MjI0ODkKLg==&lt=RjEyNzY5NTQ5ODguNjEwODY5OQou&pmt=STAxCi4=&uid=STEKLg=="
Cache-Control: max-age=0
The last path segment of the request URL (http://localhost:4000/2010/6/20/%E6%B1%89) is the percent-encoded UTF-8 form of a single Chinese character, which is three bytes long, and it's fine.
But the Referer header (Referer: http://localhost:4000/2010/6/20/%25E6%25B1%2589) is incorrect: it has an additional "25" in front of each byte.
The function get_redirect_target in zine/utils/http.py uses this Referer header to calculate the value of the "_redirect_target" hidden field.
That's why the form is redirected to the wrong location. But I can't figure out where the additional "25" (the hex value of the ASCII % character) comes from.
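To illustrate the symptom: the doubled escaping can be reproduced by percent-encoding a path that is already percent-encoded. This is only a sketch using Python 3's standard urllib.parse, not code taken from Zine, and it does not show where in Zine (or the browser) the second encoding actually happens.

```python
from urllib.parse import quote, unquote

# The character from the post URL; its UTF-8 encoding is three bytes.
slug = "\u6c49"

once = quote(slug)    # each byte is escaped one time
twice = quote(once)   # the "%" signs themselves are escaped as "%25"

print(once)   # %E6%B1%89
print(twice)  # %25E6%25B1%2589

# Decoding the doubled form once only gets back to the escaped form,
# so a redirect built from it does not match the real post URL.
assert unquote(twice) == once
assert unquote(once) == slug
```

So the Referer value looks like the result of quoting the URL a second time somewhere along the way.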
Regards,
Shuhao
Re: slug url encoding problem for Chinese Characters
- From: Armin Ronacher
- Date: 2010-06-20 @ 20:23
- Subject: Re: slug url encoding problem for Chinese Characters
Hi,
On 6/20/10 7:30 PM, wsh wrote:
> That's why the form is redirected to the wrong location. But I can't
> figure out where the additional "25" (the hex value of the ASCII
> % character) comes from.
May I ask what browser you are using? I can try to debug that problem
next week, but I'm quite busy right now, so if you have any
experience with debugging Python apps, any help would be greatly
appreciated. I suppose it is caused by improper
encoding/decoding somewhere.
Regards,
Armin
Re: slug url encoding problem for Chinese Characters
- From: Kiran Jonnalagadda
- Date: 2010-06-26 @ 09:16
- Subject: Re: slug url encoding problem for Chinese Characters
On Sun, Jun 20, 2010 at 11:00 PM, wsh <shuhao.w@gmail.com> wrote:
> I've tried the ASCII-only slugs setting too. It doesn't seem to help. I checked
> the code and found a clue.
> The problem seems to be caused by the "_redirect_target" hidden field in the form.
> I captured the HTTP request headers in Firefox as follows:
>
This may be a case of the _charset_ feature. See:
http://www.crazysquirrel.com/computing/general/form-encoding.jspx
https://bugzilla.mozilla.org/show_bug.cgi?id=18643
I had a quick look at the Zine tip source and didn't notice _charset_ being
used. Basically, the form needs to have a hidden field named _charset_, with
no value:
<input type="hidden" name="_charset_">
The browser will then submit this field with the name of the character encoding
it used for the form -- by default ISO-8859-1, but UTF-8 if the page is in UTF-8.
I'm not sure what happens if this field is missing, but most likely, it's not
submitting in UTF-8. Please add the input field to your template and see if that fixes it.
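On the server side, the submitted _charset_ value could then be used roughly like this. This is a minimal sketch with Python 3's standard library; decode_form is a hypothetical helper made up for illustration, and it is not how Zine or Werkzeug actually parses forms.

```python
from urllib.parse import parse_qsl

def decode_form(body, default="iso-8859-1"):
    """Decode a urlencoded form body using its _charset_ field, if present.

    Hypothetical helper for illustration only.
    """
    # First pass with latin-1, which can decode any byte, just to read _charset_.
    raw = dict(parse_qsl(body, keep_blank_values=True, encoding="latin-1"))
    charset = raw.get("_charset_") or default
    # Second pass with the encoding the browser reported.
    return dict(parse_qsl(body, keep_blank_values=True, encoding=charset))

form = decode_form("comment=%E6%B1%89&_charset_=UTF-8")
print(form["comment"] == "\u6c49")
```

Without the field, the sketch falls back to ISO-8859-1, so the same three bytes would come out as three unrelated characters.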