%\documentclass[a4paper,twoside]{article}
\documentclass[a4paper,11pt,twoside]{article}
%\documentclass[a4paper]{article}
\usepackage[latin1]{inputenc}
\usepackage{fancyhdr}
\usepackage{url}
\usepackage{hyperref}
\usepackage{listings}
\usepackage{color}
\usepackage{geometry}
\geometry{margin=3cm}
\author{Daniel Bosk\footnote{E-mail: \url{dbosk@kth.se}}}
\date{\today}
\title{DD2448 Foundations of Cryptography\\
A Presentation of SHA-3 Candidate \emph{Skein}}
\pagestyle{fancy}
\fancyhead{}
\fancyhead[CO]{Daniel Bosk}
\fancyhead[CE]{SHA-3 Candidate \emph{Skein}}
%\fancyhead[R]{Daniel Bosk}
%\fancyhead[L]{SHA-3 Candidate \emph{Skein}}
\fancyfoot[C]{\thepage}
\renewcommand{\headrulewidth}{0pt}
\setlength{\headheight}{14pt}
% för att skriva kod "inline" i texten.
\newcommand{\code}[1]{\lstinline£#1£}
% ställ in listings för att visa koden vackert
\definecolor{bkglisting}{gray}{0.97}
\definecolor{commentcol}{rgb}{0, 0.5, 0}
\lstset{
basicstyle=\footnotesize, % the size of the fonts that are used for the co
numbers=left, % where to put the line-numbers
numberstyle=\footnotesize, % the size of the fonts that are used for the li
stepnumber=1, % the step between two line-numbers. If it's 1 e
numbersep=5pt, % how far the line-numbers are from the code
backgroundcolor=\color{bkglisting}, % choose the background color. You must add
showspaces=false, % show spaces adding particular underscores
showstringspaces=false, % underline spaces within strings
showtabs=false, % show tabs within strings adding particular und
frame=single, % adds a frame around the code
tabsize=2, % sets default tabsize to 2 spaces
captionpos=b, % sets the caption-position to bottom
breaklines=true, % sets automatic line breaking
breakatwhitespace=false, % sets if automatic breaks should only happen at
escapeinside={\%*}{*)}, % if you want to add a comment within your code
keywordstyle=\color{blue},
stringstyle=\color{red},
commentstyle=\color{commentcol}
}
% speciellt "språk" för att visa output
\lstdefinelanguage{none}{
keywords={}
}
% style för att visa output
\lstdefinestyle{text}{
language=none,
numbers=left
}
%\lstset{language=java}
%\lstset{language=c++}
%\lstset{language=python}
\lstset{style=text}
% Presentations. Give a brief presentation of one of the Round-2 candidates
% of the SHA-3 competition,
% http://csrc.nist.gov/groups/ST/hash/sha-3.
% Written Research Presentation Requirements
% As a part of the course you may submit a written summary of a research paper
% in cryptography (see the course description for topics). Presentations not
% adhering to the following requirements may be awarded zero points.
%
% 1. The presentation must be prepared individually.
% 2. The presentation must have 4 pages.
% 3. You may discuss and comment on your presentations in study groups of
% three students, provided that your topics differ. In particular, you
% are allowed to exchange drafts of your presentations within your study
% group and comment on each others writings to improve the final
% versions. You may be involved in a different study group for your
% presentation than those for the homeworks.
%
% Recommendations
% The intended audience for the presentations is at the level of a member of
% our course. Think about the level of detail to present the research. Try to
% stress novel and interesting ideas. Include some final comments giving your
% opinion of the paper. What did you find most interesting?
%
% We strongly recommend that you prepare your presentation at least two weeks
% in advance, and then revise it the days before the deadline (if not sooner).
% This will allow you to see your own errors more clearly.
\begin{document}
\maketitle
%\begin{abstract}
% \noindent
% ...
%\end{abstract}
%\newpage
\section{Introduction}
\noindent
The SHA-3 candidate \emph{Skein} is more than just a hash function to replace
SHA-2, it is a hash function family. Skein can be used as a simple
hash function, which is why it is proposed, but it can also do native tree
hashing, natively be used as a MAC, be used in HMAC, native support for
randomized hashing, native support as a hash function for digital signatures,
natively be used as a key derivation function (KDF) and password-based key
derivation function (PBKDF), natively as a pseudo-random number generator
(PRNG), natively be used as a stream cipher, supports personalization and
finally has native support for different output sizes. All of this because of
the extensive set of options which can be used with Skein.
The Skein family of hash functions has internal states of sizes 256, 512 and
1024, which can be compared to the SHA-256 and SHA-512 hash functions. The
Skein-256 version is a low memory variant which can be implemented using only
about 100 bytes of RAM~\cite{skein}.
The Skein hash function is designed for performance in both software and
hardware, it is based on the tweakable block-cipher \emph{Threefish}, and it
is thanks to that the cipher is tweakable that Skein can support all these
options outlined above.
The core design principle of Threefish is that more simple rounds is more
secure than fewer complex rounds, and thus it uses only three different
mathematical operations: addition, exclusive-or and constant rotations. It is
based on 64-bit words to be optimal for use on modern 64-bit processors.
Note that I will in this presentation interchange bitstrings, bytestrings and
numbers for the sake of brievity. To do this correctly would involve
type-changes from strings to numbers to be able to do arithmetic with them.
However, I will assume that computations with bitstrings and numbers can be
done (using ''automagical'' transformations).
\section{The Idea}
\noindent
The idea of Skein is that it must be able to be a drop-in replacement for
other hash functions such as MD5 and the SHA-variants as well as having some
other extensive functionality. In \cite[Table 1]{skein} there is a table
listing the authors' recommended Skein configurations to use Skein as a drop-in
replacement for MD5 and SHA.
The construct which allows this high configurability of Skein is the tweakable
block-cipher, in this case Threefish. It is the block-size of Threefish, 256,
512 and 1024 bits, which determines the size of the internal state of the hash
function.
The thing used in Skein is that the tweak in Threefish allows for hashing
configuration data along with the message data in every block and thus making
every block unique. The core being the compression function called
\emph{unique block iteration (UBI)}. UBI is a chaining mode which uses
Threefish to compress arbitrary sized data into a fixed size.
(Skein, or rather UBI, would also work using any other tweakable block-cipher.)
The final part of Skein is the optional argument system. The design makes
these three parts of Skein independent of each other, partly because it makes
it easier to prove theorems about the independet systems but also to have
modularability.
\subsection{Threefish}
\noindent
The Threefish cipher is a tweakable block-cipher with block-sizes of 256, 512
or 1024 bits and always a tweak parameter of 128 bits, it works entirely with
64-bit words -- which is why it is optimal for a 64-bit processor.
The encryption function, \(E(K, T, P)\), for Threefish takes three arguments:
the keyi \(K\), the tweak \(T\) and the plaintexti \(P\). More on this in a
moment.
First, the key schedule works as such that you take the 64-bit words in the
original key supplied to the function as well as the two 64-bit words supplied
as the tweak. Subkeys are generated from this key and tweak using only XOR and
addition modulo \(2^{64}\). So the subkeys depends on both the key and the
tweak. (For details, see \cite[p. 11-12]{skein}.)
The encryption, simplified, is done in the following way: the plaintext is
added with a subkey using addition modulo \(2^{64}\) (for details, see
\cite[p. 10]{skein}).
The result is the run through a mixing function which uses addition modulo
\(2^{64}\), XOR and constant rotation according to a predefined table (for
details, see \cite[p. 11]{skein}).
When mixed, the words are permuted according to a predefined permutation (for
details, see \cite[p. 10-11]{skein}).
All these steps are repeated are repeated a total of four times before we add
the next subkey and repeat all this again. This four-step procedure is
repeated 18 times, i.e. a total of 72 mixing and permuting rounds. This
applies to the 256- and 512-bit versions of the cipher, the 1024-bit cipher
uses a total of 80 rounds.
\subsection{Unique Block Iteration (UBI)}
\noindent
Now to the most interesting part of the whole construction -- the compression
function UBI. UBI takes three inputs: a starting value \(G\), a message \(M\)
and a starting value for the tweak \(T_s\). The UBI processes the message in
blocks of size of the intrernal state and uses a tweak value for each block.
The tweak value, a number of size 128-bits, is divided into certain fields.
E.g. \emph{position} (bits 0-95), \emph{tree-level} (bits 112-118),
\emph{bit-pad} (bit 119), \emph{type} (bits 120-125), \emph{first} (bit 126)i
and finally \emph{final} (bit 127). For a better overview, see \cite[Table 5
and Figure 9]{skein}.
The fields are continuously used during the processing of a message. For
instance, if the message is not a multiple of the block-size it must be
padded, if it must be padded the \emph{bit-pad} bit must be set to 1 and
otherwise 0. The \emph{first} bit will be set to 1 for the first block of the
message and 0 otherwise, the \emph{final} bit will be set to 1 for the final
block of the message and 0 otherwise. The \emph{position} bit will be
continuously increased throughout the process and will thus guarantee
uniqueness even if the message is constantly repeating.
When the message-blocks \(M_i\) are processed, it is done accordingly:
\begin{eqnarray}
\nonumber
H_0 &=& G, \\
\nonumber
H_{j+1} &=& E(H_j, T_j, M_j),
\end{eqnarray}
where \(T_j\) is the tweak-value with the bit-fields set correctly for the
\(j\)th block. When all blocks are processed, the final \(H\) is the result of
the UBI chaining mode. This means that each block is processed uniquely, they
all depend on each other and the original message has now been compressed to a
single block-sized result.
\section{Functionality}
\noindent
The \emph{type} field of the UBI tweak is used by Skein to run in different
types of operation. The different type-values are listen in \cite[Table
6]{skein}, but some of them are used to include a key in the hash, e.g. useful
for MAC and KDF, to include the configuration block block in the hash, and so
on. The configuration type is mandatory to run, and they must be run in the
order they are defined in \cite[Table 6]{skein}.
Skein must be run in the following way. First we run the key through the UBI
as \(K' = \mathrm{UBI}(0, K, T_{\mathrm{key}}2^{120})\), where \(K\) is our
key and we must set the 120th bit of the tweak (i.e. the \emph{type} bits) to
the type \(T_{\mathrm{key}}\). If we have no key, we let \(K' = 0\).
After this step comes the configuration. The configuration is done by
\[
G_0 = \mathrm{UBI}(K', C, T_{\mathrm cfg}2^{120}),
\]
where \(C\) is our configuration. The configuration is a 32-byte long
bitstring divided into fields just as the tweak. The contains information such
as schema identifier, in this case the byte-representation of ''SHA3'',
version number, output length, and some settings regarding tree hashing.
Once this configuration step is done we can start the more general part of the
algorithm.
Let \(L\) be a list of 2-tuples \((T_0, M_0), \ldots, (T_{t-1}, M_{t-1})\)
such that \(T_{\mathrm cfg} < T_0\), \(T_i < T_{i+1}\) and
\(T_{t-1} < T_{\mathrm out}\), \(T_{\mathrm out}\) being the type of largest
value. Now, if we do not intend to do any tree hashing\footnote{Tree hashing
is not covered in this presentation, it requires some other processing of the
messages.}, we can start processing this list by letting
\[
G_{i+1} =\mathrm{UBI}(G_i, M_i, T_i2^{120}),
\]
for \(i = 0, 1, 2, \ldots, t-1\). When the list is processed we find our
result \(H\) by
\[
H = \mathrm{concatenate}(\mathrm{UBI}(G_t, 0, T_{out}2^{120}),
\mathrm{UBI}(G_t, 1, T_{out}2^{120}), \ldots)
\]
until the desired output size is reached (as set in configuration \(C\)).
\subsection{Features}
\noindent
Now, all those features mentioned in the introduction, almost every one of
them has their own type defined. By adding an appropriate message and the
corresponding type you do lots of different things with Skein. (Note however
that not all types can be combined.) One interesting type of operation, or
application, is the personalization. This application is used to personalize
the hash, i.e. if you need two different hash functions you can use Skein at
both times and just personalize them differently. The authors \cite[p.
21]{skein} recommends using the following format for a string: \code{20100319
dbosk@kth.se APP/test1} and \code{20100319 dbosk@kth.se APP/test2}, i.e.
today's date, your valid email (this date), the application name and some
functional specifier. The you get two different hash functions using the same
API. This could be useful when hashes are used in more than one place in a
protocol. This is because if a hash is first computed on some data, this hash
might be useful for an attacker because later another hash might be computed
on similar or related data.
The tree hashing is also an interesting functionality. Although, it requires
another algorithm, as mentioned above, which is not covered in this text, the
idea is interesting. The purpose of the tree hashing is to be able to hash
using parallell processing, e.g. on a multi-core processor. It can also be
utilized when rehashing of data is needed, because only parts of the tree must
be rehashed thus also increasing performance.
\section{Security}
\noindent
The security of Skein is proven by the authors in \cite{proofs}, in the actual
specification \cite{skein} they describe the results of the proofs. The
conclusion of the security claims is that Skein is proven to be secure if the
compression function is collision resistant. Also, Skein's property of being a
pseudo-random function (PRF) and secure to use for KDF, MAC, HMAC, PRNG and as
a stream cipher is proven with the assumption that Threefish is a tweakable
pseudo-random permutation. There are some known attacks on Threefish, however
they seem to be quite harmless and Threefish should be about as secure as AES
\cite{skein}.
\section{Conclusion}
\noindent
The paper was a very interesting read. It is a very interesting algorithm
which can be both easily used and easily expanded in many ways, as it is
already prepared for futher expantion. It is also interesting that such a
complex construction can be built by such simple tools. The separation of the
parts, e.g. the tweakable block-cipher and the UBI compression function, makes
it very easy to follow and to get an overview of the function family.
%\newpage
\begin{thebibliography}{9}
\bibitem{skein}
Bellare, M., Kohno, T., Lucks, S., Ferguson, N., Schneier, B.,
Whiting, D., Callas, J. and Walker, J.
\emph{The Skein Hash Function Family},
version 1.2, 15 September 2009.
Specification submitted to NIST SHA-3 competition.
\bibitem{proofs}
Bellare, M., Kohno, T., Lucks, S., Ferguson, N., Schneier, B.,
Whiting, D., Callas, J. and Walker, J.
\emph{Provable Security Support for the Skein Hash Family},
version 1.0, April 2009.
\url{http://www.skein-hash.info/sites/default/files/skein-proofs.pdf}.
\end{thebibliography}
\vspace{1cm}
\end{document}