Protocol Ciphertext Field Identification by Entropy Estimating

doi:10.11999/JEIT151205


Abstract

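Existing protocol analysis methods based on network traffic consider only the plaintext in message payloads and are therefore unsuitable for security protocols that carry large amounts of ciphertext. To fully exploit the ciphertext data features of security protocols whose specifications are unknown, and given that their messages mix plaintext with ciphertext and that ciphertext positions vary, this paper proposes CFIA (Ciphertext Field Identification Approach), a ciphertext field identification method for security protocols based on entropy estimation. After mining keyword sequences, CFIA uses byte sample entropy to characterize the distribution of bytes in network flows and, relying on the randomness of ciphertext, pre-locates the interval in which ciphertext fields lie by entropy estimation. It then searches for the ciphertext length field, locates the boundaries of the ciphertext field, and identifies the field. Experimental results show that the method identifies protocol ciphertext fields effectively using only network traffic data, with high accuracy.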
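The pre-location step can be illustrated with a minimal sketch. It is not the authors' CFIA implementation: the function names, the normalization by the largest entropy reachable with n samples, and the `threshold` and `min_run` values are all assumptions chosen for illustration. The sketch computes a plug-in sample entropy for each byte offset across a set of messages and reports runs of offsets whose entropy stays high, which is where random-looking ciphertext would be expected.

```python
import math
from collections import Counter


def byte_sample_entropy(samples: bytes) -> float:
    """Plug-in (maximum-likelihood) Shannon entropy of a byte sample, in bits."""
    n = len(samples)
    if n == 0:
        return 0.0
    counts = Counter(samples)
    return sum((c / n) * math.log2(n / c) for c in counts.values())


def offset_entropies(messages):
    """Normalized sample entropy at each byte offset shared by all messages."""
    n = len(messages)
    if n < 2:
        raise ValueError("need at least two messages")
    width = min(len(m) for m in messages)
    # With n samples the plug-in estimate cannot exceed log2(min(256, n)),
    # so divide by that cap to obtain a value in [0, 1] (assumed normalization).
    cap = math.log2(min(256, n))
    return [
        byte_sample_entropy(bytes(m[i] for m in messages)) / cap
        for i in range(width)
    ]


def prelocate(entropies, threshold=0.9, min_run=8):
    """Return half-open (start, end) intervals where entropy stays >= threshold.

    threshold and min_run are illustrative values, not taken from the paper.
    """
    runs, start = [], None
    for i, h in enumerate(list(entropies) + [0.0]):  # sentinel closes a trailing run
        if h >= threshold and start is None:
            start = i
        elif h < threshold and start is not None:
            if i - start >= min_run:
                runs.append((start, i))
            start = None
    return runs
```

Calling `prelocate(offset_entropies(messages))` on messages of one type gives candidate ciphertext intervals. The normalization matters for small samples: with only a few dozen messages even perfectly random bytes cannot reach 8 bits of plug-in entropy, so an absolute threshold would miss them.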

Authors: ZHU Yuna, HAN Jihong, YUAN Lin, GU Wen, FAN Yudan

Keywords: unknown security protocol, protocol format, ciphertext field, entropy estimation

Abstract：

Previous network-trace-based methods consider only the plaintext format of payload data and are not suitable for security protocols that include a large amount of ciphertext data. Therefore, a novel approach named CFIA (Ciphertext Field Identification Approach), based on entropy estimation, is proposed for unknown security protocols. On the basis of keyword sequence extraction, CFIA uses byte sample entropy and entropy estimation to pre-locate the ciphertext field, and then searches for the ciphertext length field to identify the ciphertext field. The experimental results show that, without using dynamic binary analysis, the proposed method can effectively identify ciphertext fields purely from network traces, and that the inferred formats are highly accurate in identifying the protocols.
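The length-field search can be sketched as follows. This is a hypothetical heuristic, not the procedure from the paper: it assumes the candidate ciphertext region starts at a known offset `region_start`, runs to the end of each message, and is preceded by an unsigned integer field whose value equals the number of bytes in that region. The function name and the candidate field widths are likewise assumptions.

```python
def find_length_field(messages, region_start, widths=(1, 2, 4)):
    """Return (offset, width, byteorder) candidates for a length field.

    A candidate qualifies when, in every message, the integer stored at that
    offset equals the number of bytes from region_start to the message end.
    """
    hits = []
    for width in widths:
        # Byte order is irrelevant for one-byte fields, so test it only once.
        orders = ("big",) if width == 1 else ("big", "little")
        for off in range(0, region_start - width + 1):
            for order in orders:
                if all(
                    int.from_bytes(m[off:off + width], order) == len(m) - region_start
                    for m in messages
                ):
                    hits.append((off, width, order))
    return hits
```

Messages with differing body lengths are needed for the check to be meaningful; if every message has the same length, any constant field holding that value would match by coincidence.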

Key words: Unknown security protocol, Protocol format, Ciphertext field, Entropy estimation

Received: 2015-10-29 Published: 2016-05-09

CLC number:

TP393.08

Funding:

National Natural Science Foundation of China (61309018)

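Corresponding author: ZHU Yuna, female, born in 1985, Ph.D. candidate; research interests: reverse engineering and identification of security protocols. E-mail: zyn_qingdao@126.com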

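About the authors: ZHU Yuna, female, born in 1985, Ph.D. candidate; research interests: reverse engineering and identification of security protocols. HAN Jihong, female, born in 1966, professor and doctoral supervisor; research interests: network and information security, formal analysis and automated verification of security protocols. YUAN Lin, male, associate professor; research interests: formal analysis and automated verification of security protocols, software trustworthiness analysis.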

Cite this article:

ZHU Yuna, HAN Jihong, YUAN Lin, GU Wen, FAN Yudan. Protocol Ciphertext Field Identification by Entropy Estimating[J]. Journal of Electronics & Information Technology (JEIT), 2016, 38(8): 1865-1871.

Link to this article:

http://jeit.ie.ac.cn/CN/10.11999/JEIT151205 or http://jeit.ie.ac.cn/CN/Y2016/V38/I8/1865
