java.net.URLEncoder and java.net.URLDecoder.decode

    xiaoxiao2021-03-25  71

    java.net.URLEncoder

    URLEncoder encodes HTML form data. The class contains static methods for converting a string to the application/x-www-form-urlencoded format, in which data is encoded as name/value pairs.

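    For a GET request, the browser uses x-www-form-urlencoded encoding to turn the form data into a single string (name1=value1&name2=value2…) and appends that string to the URL as request parameters. For a POST request, the browser places the form data in the HTTP body and sends it to the server.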
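    As a rough illustration of the name1=value1&name2=value2 form, the following sketch builds a query string from a map. The class name QueryStringSketch, the helper toQueryString, and the example.com URL are made up for this example and are not part of any library.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

public class QueryStringSketch {

    // Hypothetical helper: joins the entries as name1=value1&name2=value2,
    // encoding every name and every value separately.
    static String toQueryString(Map<String, String> params) {
        StringBuilder sb = new StringBuilder();
        try {
            for (Map.Entry<String, String> e : params.entrySet()) {
                if (sb.length() > 0) {
                    sb.append('&');
                }
                sb.append(URLEncoder.encode(e.getKey(), "UTF-8"))
                  .append('=')
                  .append(URLEncoder.encode(e.getValue(), "UTF-8"));
            }
        } catch (UnsupportedEncodingException ex) {
            // UTF-8 is required on every Java platform, so this is not expected
            throw new IllegalStateException(ex);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> form = new LinkedHashMap<>();
        form.put("name", "Tom & Jerry");
        form.put("q", "a=b");
        // prints http://example.com/search?name=Tom+%26+Jerry&q=a%3Db
        System.out.println("http://example.com/search?" + toQueryString(form));
    }
}
```

    Encoding each name and value on its own keeps a literal & or = inside the data from being mistaken for a separator.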

    Rules URLEncoder follows when encoding

    a-z, A-Z and 0-9 are left unchanged. The special characters ".", "-", "*" and "_" are left unchanged. A space is converted to "+". Every other character is converted to bytes in the given encoding, and each byte is written as %xy, where xy is its two-digit hexadecimal value.

    import java.io.UnsupportedEncodingException;
    import java.net.URLEncoder;

    public class Main {
        public static void main(String[] args) {
            try {
                String string = "The string ü@foo-bar";
                // Encode as application/x-www-form-urlencoded using UTF-8
                String encodedString = URLEncoder.encode(string, "UTF-8");
                System.out.println("Encoded String: " + encodedString);
            } catch (UnsupportedEncodingException e) {
                e.printStackTrace();
            }
        }
    }

    The code above [1] uses UTF-8. "The string ü@foo-bar" is converted to "The+string+%C3%BC%40foo-bar": in UTF-8, ü is encoded as the two bytes C3 and BC (hexadecimal), and @ as the single byte 40 (hexadecimal).
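    To see where C3, BC and 40 come from, this small sketch writes every UTF-8 byte of a string as %XY. Unlike URLEncoder it exempts no characters; the class PercentBytesSketch and the method percentEncodeAllBytes are hypothetical names used only here.

```java
import java.nio.charset.StandardCharsets;

public class PercentBytesSketch {

    // Writes each UTF-8 byte of s as %XY (two uppercase hex digits).
    // Illustration only: no character is left unencoded.
    static String percentEncodeAllBytes(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            // b & 0xFF turns the signed byte into a value in 0..255
            sb.append('%').append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // \u00FC is the character ü
        System.out.println(percentEncodeAllBytes("\u00FC")); // prints %C3%BC
        System.out.println(percentEncodeAllBytes("@"));      // prints %40
    }
}
```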

    java.net.URLDecoder

    URLDecoder does the reverse: it decodes strings in the application/x-www-form-urlencoded format.

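    import java.io.UnsupportedEncodingException;
    import java.net.URLDecoder;

    public class Main {
        public static void main(String[] args) throws UnsupportedEncodingException {
            // ':', '&' and '%' must appear as %3A, %26 and %25; a bare '%' that is not
            // followed by two hex digits makes decode throw IllegalArgumentException
            System.out.println(URLDecoder.decode("special+chars%3A+%26%25*+", "UTF-8"));
        }
    }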

    The code above [2] decodes using UTF-8 and prints:

    special chars: &%*
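    A quick way to check that decoding undoes encoding is a round trip. The following is a minimal sketch; RoundTripSketch and its encode/decode wrappers are hypothetical names that only hide the checked exception.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

public class RoundTripSketch {

    // Thin wrapper around URLEncoder.encode with UTF-8
    static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    // Thin wrapper around URLDecoder.decode with UTF-8
    static String decode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    public static void main(String[] args) {
        String original = "special chars: &%* ";
        String encoded = encode(original);
        System.out.println(encoded);                           // prints special+chars%3A+%26%25*+
        System.out.println(decode(encoded).equals(original));  // prints true
    }
}
```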

    Source analysis of java.net.URLDecoder.decode(String, String)

    public static String decode(String s, String enc)
            throws UnsupportedEncodingException {

        boolean needToChange = false;
        int numChars = s.length();
        StringBuffer sb = new StringBuffer(numChars > 500 ? numChars / 2 : numChars);
        int i = 0;

        if (enc.length() == 0) {
            throw new UnsupportedEncodingException("URLDecoder: empty string enc parameter");
        }

        char c;
        byte[] bytes = null;
        while (i < numChars) {
            c = s.charAt(i);
            switch (c) {
            case '+':
                sb.append(' ');
                i++;
                needToChange = true;
                break;
            case '%':
                /*
                 * Starting with this instance of %, process all
                 * consecutive substrings of the form %xy. Each
                 * substring %xy will yield a byte. Convert all
                 * consecutive bytes obtained this way to whatever
                 * character(s) they represent in the provided
                 * encoding.
                 */
                try {
                    // (numChars-i)/3 is an upper bound for the number
                    // of remaining bytes
                    if (bytes == null)
                        bytes = new byte[(numChars-i)/3];
                    int pos = 0;

                    while ( ((i+2) < numChars) && (c=='%')) {
                        int v = Integer.parseInt(s.substring(i+1,i+3),16);
                        if (v < 0)
                            throw new IllegalArgumentException("URLDecoder: Illegal hex characters in escape (%) pattern - negative value");
                        bytes[pos++] = (byte) v;
                        i+= 3;
                        if (i < numChars)
                            c = s.charAt(i);
                    }

                    // A trailing, incomplete byte encoding such as
                    // "%x" will cause an exception to be thrown
                    if ((i < numChars) && (c=='%'))
                        throw new IllegalArgumentException(
                            "URLDecoder: Incomplete trailing escape (%) pattern");

                    sb.append(new String(bytes, 0, pos, enc));
                } catch (NumberFormatException e) {
                    throw new IllegalArgumentException(
                        "URLDecoder: Illegal hex characters in escape (%) pattern - "
                        + e.getMessage());
                }
                needToChange = true;
                break;
            default:
                sb.append(c);
                i++;
                break;
            }
        }

        return (needToChange? sb.toString() : s);
    }

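    Decoding can be summarized as:

    1. Walk the input string character by character.

    2. If the character is "+", append a space. If it is neither "+" nor "%", append it unchanged.

    3. If the character is "%", process the run of consecutive %xy escapes that starts there:

    3.1 Allocate a byte buffer whose size is an upper bound on the number of bytes still to be decoded: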

    bytes = new byte[(numChars-i)/3];

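    numChars-i is the number of characters remaining, and each %xy escape takes 3 characters. The "%" itself is only a marker and xy is one byte written in hexadecimal, so each escape yields exactly one byte. (numChars-i)/3 is therefore the maximum number of bytes the rest of the string can decode to.

    3.2 Parse each xy as a hexadecimal byte into the buffer, then convert the collected bytes to characters using enc. Invalid hex digits or an incomplete trailing escape such as "%x" cause an IllegalArgumentException to be thrown.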
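    The buffer bound in step 3.1 and the error handling in step 3.2 can be checked with a small experiment. This is only a sketch: DecodeStepsSketch and its three helpers are hypothetical names, and countEscapes merely mirrors the loop condition of the source shown above.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

public class DecodeStepsSketch {

    // Same arithmetic as the buffer allocation: remaining chars divided by 3
    static int upperBound(String s, int i) {
        return (s.length() - i) / 3;
    }

    // Counts the consecutive %xy escapes starting at index i,
    // using the same loop condition as the decode source
    static int countEscapes(String s, int i) {
        int n = 0;
        while (i + 2 < s.length() && s.charAt(i) == '%') {
            n++;
            i += 3;
        }
        return n;
    }

    // Returns false when URLDecoder rejects the input
    static boolean isDecodable(String s) {
        try {
            URLDecoder.decode(s, "UTF-8");
            return true;
        } catch (IllegalArgumentException e) {
            return false; // malformed escape
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    public static void main(String[] args) {
        String s = "%E4%BD%A0abc"; // three escapes followed by three plain chars
        System.out.println(upperBound(s, 0));     // prints 4  (12 / 3)
        System.out.println(countEscapes(s, 0));   // prints 3  (bytes actually used)
        System.out.println(isDecodable("a%20b")); // prints true
        System.out.println(isDecodable("%"));     // prints false (incomplete escape)
        System.out.println(isDecodable("%zz"));   // prints false (illegal hex digits)
    }
}
```

    The buffer holds 4 bytes while only 3 are filled, which shows that (numChars-i)/3 is an upper bound rather than an exact count.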

    References

    [1] https://www.udemy.com/collection/java-code-geeks/all-courses/?pmtag=APRUDEMY17&siteID=fauDoMV7FnU-Gz9kCuhFvRfa4V26e0XAig&LSNPUBID=fauDoMV7FnU

    [2] https://examples.javacodegeeks.com/core-java/net/urldecoder/java-net-urldecoder-example/
    When reposting, please cite the original address: https://ju.6miu.com/read-23572.html
