Why Request.QueryString Returns Garbled Parameters: How It Works

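This morning a colleague asked me for help: the parameters his page received were garbled.

The actual scenario:

The platform my colleague maintains is built on the Ext.js framework, and its web.config sets the global encoding to GB2312: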

<globalization requestEncoding="gb2312" responseEncoding="gb2312" fileEncoding="gb2312" culture="zh-CN"/>

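When the front end submits Chinese text, the back end reads garbled text from Request.QueryString["xxx"].

Calling System.Web.HttpUtility.UrlDecode("xxx", encoding) on that value does not help, whichever encoding is passed.

How it works:

1: First, one thing is certain: Ext.js encodes URL parameters on the client before submitting them, and the client-side encoding defaults to UTF-8.

The client has three built-in encoding functions: escape(), encodeURI() and encodeURIComponent(). Of these, encodeURI() and encodeURIComponent() percent-encode the UTF-8 bytes of a character, while escape() emits %uXXXX sequences for characters outside Latin-1.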
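
To make the on-the-wire form concrete, here is a minimal C# sketch. It assumes Uri.EscapeDataString as a server-side stand-in for encodeURIComponent() (both percent-encode the UTF-8 bytes of non-ASCII text); the class and method names are made up for illustration.

```csharp
using System;

public static class ClientEncodingSketch
{
    // Stand-in for the browser's encodeURIComponent():
    // percent-encodes the UTF-8 bytes of the value.
    public static string EncodeLikeBrowser(string value)
    {
        return Uri.EscapeDataString(value);
    }

    public static void Main()
    {
        // Each of these Chinese characters is three UTF-8 bytes,
        // so each becomes three %XX groups.
        Console.WriteLine(EncodeLikeBrowser("中文")); // %E4%B8%AD%E6%96%87
    }
}
```

This percent-encoded UTF-8 string is what the server has to decode, so the encoding it picks for decoding matters.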

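2: So why is the value garbled when it is read through Request.QueryString["xxx"]?

To answer that, we have to unpack what Request.QueryString actually does internally.

Let's decompile it step by step.

2.1: The code of the QueryString property: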

public NameValueCollection QueryString
{
    get
    {
        if (this._queryString == null)
        {
            this._queryString = new HttpValueCollection();
            if (this._wr != null)
            {
                this.FillInQueryStringCollection();// the key entry point
            }
            this._queryString.MakeReadOnly();
        }
        if (this._flags[1])
        {
            this._flags.Clear(1);
            ValidateNameValueCollection(this._queryString, "Request.QueryString");
        }
        return this._queryString;
    }
}

2.2: Step into FillInQueryStringCollection():

private void FillInQueryStringCollection()
{
    byte[] queryStringBytes = this.QueryStringBytes;
    if (queryStringBytes != null)
    {
        if (queryStringBytes.Length != 0)
        {
            this._queryString.FillFromEncodedBytes(queryStringBytes, this.QueryStringEncoding);
        }
    }// The branch above handles a query string that is available as raw bytes.
    else if (!string.IsNullOrEmpty(this.QueryStringText))
    {
        // This branch handles a query string available as text: FillFromString is the entry point, and the encoding that matters is this.QueryStringEncoding
        this._queryString.FillFromString(this.QueryStringText, true, this.QueryStringEncoding);
        
    }
}

2.3: Step into QueryStringEncoding:

internal Encoding QueryStringEncoding
{
    get
    {
        Encoding contentEncoding = this.ContentEncoding;
        if (!contentEncoding.Equals(Encoding.Unicode))
        {
            return contentEncoding;
        }
        return Encoding.UTF8;
    }
}
// Stepping into this.ContentEncoding gives:
public Encoding ContentEncoding
{
    get
    {
        if (!this._flags[0x20] || (this._encoding == null))
        {
            this._encoding = this.GetEncodingFromHeaders();
            if (this._encoding == null)
            {
                GlobalizationSection globalization = RuntimeConfig.GetLKGConfig(this._context).Globalization;
                this._encoding = globalization.RequestEncoding;
            }
            this._flags.Set(0x20);
        }
        return this._encoding;
    }
    set
    {
        this._encoding = value;
        this._flags.Set(0x20);
    }
}

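Notes:

From the QueryStringEncoding and ContentEncoding code: the encoding is first taken from the request headers; if the headers do not specify one, the requestEncoding of the globalization configuration section is used (which itself defaults to UTF-8 when it is not configured). QueryStringEncoding substitutes UTF-8 only when the resolved encoding is Encoding.Unicode (UTF-16).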
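
The resolution order can be sketched as a standalone function. This is a simplified model of the decompiled logic, not the framework code itself: Resolve and its two parameters are hypothetical, and iso-8859-1 merely stands in for whatever web.config specifies.

```csharp
using System;
using System.Text;

public static class QueryStringEncodingSketch
{
    // Hypothetical helper mirroring the decompiled properties:
    //   fromHeaders - charset found in the request headers, or null
    //   fromConfig  - requestEncoding from the globalization section
    public static Encoding Resolve(Encoding fromHeaders, Encoding fromConfig)
    {
        // ContentEncoding: headers first, configuration second.
        Encoding content = fromHeaders ?? fromConfig;
        // QueryStringEncoding: UTF-16 is replaced by UTF-8, anything else is kept.
        return content.Equals(Encoding.Unicode) ? Encoding.UTF8 : content;
    }

    public static void Main()
    {
        Encoding configured = Encoding.GetEncoding("iso-8859-1");
        Console.WriteLine(Resolve(null, configured).WebName);             // configuration wins
        Console.WriteLine(Resolve(Encoding.UTF8, configured).WebName);    // headers win
        Console.WriteLine(Resolve(Encoding.Unicode, configured).WebName); // UTF-16 becomes UTF-8
    }
}
```

A GET request usually carries no charset in its headers, which is why the configured encoding ends up deciding how the query string is decoded.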

2.4: Step into FillFromString(string s, bool urlencoded, Encoding encoding):

internal void FillFromString(string s, bool urlencoded, Encoding encoding)
{
    int num = (s != null) ? s.Length : 0;
    for (int i = 0; i < num; i++)
    {
        int startIndex = i;
        int num4 = -1;
        while (i < num)
        {
            char ch = s[i];
            if (ch == '=')
            {
                if (num4 < 0)
                {
                    num4 = i;
                }
            }
            else if (ch == '&')
            {
                break;
            }
            i++;
        }
        string str = null;
        string str2 = null;
        if (num4 >= 0)
        {
            str = s.Substring(startIndex, num4 - startIndex);
            str2 = s.Substring(num4 + 1, (i - num4) - 1);
        }
        else
        {
            str2 = s.Substring(startIndex, i - startIndex);
        }
        if (urlencoded)// the caller passes true, so the following statement runs
        {
            base.Add(HttpUtility.UrlDecode(str, encoding), HttpUtility.UrlDecode(str2, encoding));
        }
        else
        {
            base.Add(str, str2);
        }
        if ((i == (num - 1)) && (s[i] == '&'))
        {
            base.Add(null, string.Empty);
        }
    }
}

Notes:

What we find here: every parameter name and value is passed through HttpUtility.UrlDecode(..., encoding) once.
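
The effect of that single decode can be reproduced outside ASP.NET. The sketch below is an illustration under assumptions: the class and helper names are invented, and the RegisterProvider line is only needed on .NET Core / .NET 5+ (on .NET Framework, where System.Web lives, gb2312 is built in and the line can be removed).

```csharp
using System;
using System.Text;
using System.Web;

public static class DecodeMismatchSketch
{
    public static Encoding Gb2312()
    {
        // Assumption: running on .NET Core / .NET 5+, where code-page
        // encodings must be registered before gb2312 can be looked up.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        return Encoding.GetEncoding("gb2312");
    }

    // The same call FillFromString applies to every name and value.
    public static string DecodeAs(string encodedValue, Encoding encoding)
    {
        return HttpUtility.UrlDecode(encodedValue, encoding);
    }

    public static void Main()
    {
        string onTheWire = "%E4%B8%AD%E6%96%87"; // "中文" percent-encoded as UTF-8
        Console.WriteLine(DecodeAs(onTheWire, Encoding.UTF8)); // matches the client: 中文
        Console.WriteLine(DecodeAs(onTheWire, Gb2312()));      // mismatched: garbled text
    }
}
```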

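3: The conclusion

When client-side JavaScript submits Chinese text percent-encoded as UTF-8, Request.QueryString first decodes it with the gb2312 encoding configured in globalization, and the result is garbled.

The causes, in full:

1: The JavaScript side encodes as UTF-8.

2: The server is configured with gb2312 as its default.

3: Request.QueryString calls HttpUtility.UrlDecode with the configured encoding to decode incoming parameters.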

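Additional notes:

1: The encoding is resolved in this order: HTTP request headers -> globalization configuration section -> default UTF-8.

2: When Chinese is typed directly into the URL, browsers may handle it differently. For example, IE submits it without encoding, while Firefox encodes the URL as gb2312 before submitting.

3: Chinese characters that arrive un-encoded still go through the internal HttpUtility.UrlDecode call in Request.QueryString. When bytes in one encoding are decoded as the other (gb2312 read as UTF-8, for instance) and a byte sequence has no matching character, it is replaced by default with U+FFFD ("%ufffd" when escaped), so the garbling cannot be reversed.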
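
A small sketch of why the replacement is irreversible, assuming the gb2312-read-as-UTF-8 direction. The class and method names are illustrative only.

```csharp
using System;
using System.Text;

public static class ReplacementCharSketch
{
    // Decodes bytes as UTF-8; byte sequences that are not valid UTF-8
    // are replaced with U+FFFD.
    public static string DecodeAsUtf8(byte[] bytes)
    {
        return Encoding.UTF8.GetString(bytes);
    }

    public static void Main()
    {
        byte[] gb2312Bytes = { 0xD6, 0xD0 }; // "中" encoded as gb2312
        string decoded = DecodeAsUtf8(gb2312Bytes);

        Console.WriteLine(decoded.IndexOf('\uFFFD') >= 0); // True

        // Re-encoding produces the bytes of U+FFFD, not the original 0xD6 0xD0,
        // so the original text can no longer be recovered from the string.
        Console.WriteLine(BitConverter.ToString(Encoding.UTF8.GetBytes(decoded)));
    }
}
```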

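4: Solutions

Once the mechanism is understood, there are several ways to fix it:

1: Use UTF-8 everywhere. It is the least trouble.

2: If GB2312 is configured globally and the URL carries Chinese, the JavaScript side has to encode it, as the Ext.js framework does.

In that case you need special handling on the server and must decode with an explicit encoding.
The system has already called HttpUtility.UrlDecode("xxx", configured encoding) once,
so call HttpUtility.UrlEncode("xxx", configured encoding) to get back to the original UTF-8 percent-encoded parameter,
then decode it with HttpUtility.UrlDecode("xxx", UTF-8). This recovers the value only if the first decode did not replace any bytes (see additional note 3).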
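
The re-encode and decode steps can be sketched as follows. This is a minimal illustration, not a robust fix: Redecode is a hypothetical helper, the RegisterProvider line is only needed on .NET Core / .NET 5+, and the round trip is lossy whenever the first decode substituted a replacement character (for example when a value's UTF-8 bytes do not pair up into valid gb2312 characters).

```csharp
using System;
using System.Text;
using System.Web;

public static class RedecodeSketch
{
    public static Encoding Gb2312()
    {
        // Assumption: .NET Core / .NET 5+; remove on .NET Framework.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        return Encoding.GetEncoding("gb2312");
    }

    // wronglyDecoded: the value as Request.QueryString returned it.
    // configured:     the encoding the framework used for its decode.
    public static string Redecode(string wronglyDecoded, Encoding configured)
    {
        // Undo the framework's decode to get the percent-encoded bytes back...
        string reEncoded = HttpUtility.UrlEncode(wronglyDecoded, configured);
        // ...then decode those bytes with the encoding the client really used.
        return HttpUtility.UrlDecode(reEncoded, Encoding.UTF8);
    }

    public static void Main()
    {
        Encoding configured = Gb2312();
        // Simulate what the framework hands back for a UTF-8 encoded parameter.
        string garbled = HttpUtility.UrlDecode("%E4%BD%A0%E5%A5%BD", configured);
        Console.WriteLine(garbled);
        Console.WriteLine(Redecode(garbled, configured));
    }
}
```

Because of the lossy cases, it is worth comparing the result against the raw request before trusting it.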

5: One more note: the URI property is also decoded once by default, whereas Request.RawUrl holds the original, undecoded parameters.
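
Since the raw URL still carries the original percent-encoded bytes, one option is to parse it yourself with an explicit encoding. The sketch below assumes a rawUrl string of the kind Request.RawUrl returns; the path, the parameter name and the ReadUtf8Parameter helper are hypothetical.

```csharp
using System;
using System.Collections.Specialized;
using System.Text;
using System.Web;

public static class RawUrlSketch
{
    // Reads one parameter from a raw URL, decoding it as UTF-8
    // regardless of the configured request encoding.
    public static string ReadUtf8Parameter(string rawUrl, string name)
    {
        int q = rawUrl.IndexOf('?');
        if (q < 0)
        {
            return null; // no query string at all
        }
        NameValueCollection values =
            HttpUtility.ParseQueryString(rawUrl.Substring(q + 1), Encoding.UTF8);
        return values[name];
    }

    public static void Main()
    {
        // Stands in for Request.RawUrl inside a page or handler.
        string rawUrl = "/page.aspx?name=%E4%B8%AD%E6%96%87";
        Console.WriteLine(ReadUtf8Parameter(rawUrl, "name")); // 中文
    }
}
```

Unlike re-encoding an already decoded value, this never passes through the mismatched decode, so no bytes are lost.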
