% !TeX root = ../Thesis.tex
%************************************************
\chapter{Implementation}\label{ch:implementation}
%************************************************
\glsresetall % Resets all acronyms to not used
% - Goal of this thesis is security analysis using differential testing
% - first idea (naive implementation): use simtrace2 to capture traffic between the LPA (ue) and euicc
% - simtrace2 sends \glspl{apdu} to socket via udp packet -> read data from socket -> analyse apdu command for instruction type
% - save recorded traffic to file
% - insert other euicc into pcsc card reader -> replay each apdu to euicc
% - check for differences in the responses
% - problem: rsp uses signed nonces -> can't replay data
% - next idea: implement lpa to perform actions via code -> not rely on manual interaction with esim manufacturer lpa app, manufacturer lpa introduce traffic that is not necessary for the intended action
% - use the lpa to produce traffic for the euicc in the pcsc card reader, but mutate it before sending
% - record the returned status codes and check if different euicc behaves the same (crashes at the same point or returns the same status word)
% - on the slower side -> rsp is stateful and we rely on the sm-dp+ from the profile vendor
% - small problem with apdu mutation: we basically just fuzz the asn1 parser of the euicc sometimes
% - alternative: fuzz valid input data
% - oss-fuzz proposes python hypothesis as a framework for fuzzing via python
% - python hypothesis: property based testing library -> we define input structure and hypothesis produces data that is valid for the given structure
% - tests for edge cases
% - in the following sections i will go into details on how each implementation work
The primary goal of this thesis is to conduct a security analysis of commercial \gls{esim} implementations using differential testing. The underlying idea of this approach is to systematically compare the behavior of different \gls{euicc} implementations under the same inputs to detect inconsistencies or vulnerabilities.
\paragraph{Initial Naive Approach}
The first implementation was based on a straightforward observation setup using the \texttt{simtrace2} tool. \texttt{simtrace2}~\cite{osmocom_simtrace_nodate} allows monitoring of communication between a physical device (typically a smartphone acting as the \gls{lpa}) and a \gls{sim} card. The tool captures \glspl{apdu} and forwards them via \gls{udp} packets to a local socket. From this socket, the \gls{apdu} data can be read, parsed, and analyzed.
The proposed method was to:
\begin{enumerate}
\item Record the \gls{apdu} traffic between the \gls{lpa} and the \gls{euicc} during an \gls{rsp} session.
\item Store this traffic in a structured format.
\item Replace the original \gls{euicc} with another one inserted into a \gls{pcsc}-compatible card reader.
\item Replay each recorded \gls{apdu} and monitor the response.
\end{enumerate}
The goal was to detect behavioral differences, such as differing \glspl{sw} or execution failures. However, this method proved infeasible in practice due to the nature of the \gls{rsp} protocol: many operations are cryptographically bound to the specific session through signed nonces, so a recorded message is no longer valid when it is replayed in a new session or against a different \gls{euicc}.
\paragraph{Controlled LPA Implementation}
To overcome the limitations of passive traffic replay, a new strategy was developed. Rather than relying on the proprietary \gls{lpa} applications often provided by \gls{esim} vendors, we implemented our own minimal \gls{lpa}. The motivation behind this was twofold:
\begin{itemize}
\item Vendor \glspl{lpa} often introduce extraneous or undocumented traffic unrelated to the provisioning process, which complicates analysis.
\item A custom \gls{lpa} allows for controlled mutation and injection of \gls{apdu} sequences.
\end{itemize}
The implemented \gls{lpa} performs a target operation (e.g., profile download or enablement) by issuing the appropriate command sequence to the \gls{euicc} in the \gls{pcsc} card reader. Before sending, \glspl{apdu} can be programmatically mutated to evaluate the robustness of the implementation against malformed or unexpected inputs. The \gls{lpa} records the returned \glspl{sw} and checks for behavioral consistency across different \glspl{euicc}.
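
As a minimal sketch of such a mutation step, the following Python snippet flips a single bit in the data field of a command \gls{apdu} before transmission. The \texttt{Apdu} class and the \texttt{flip\_random\_bit} helper are hypothetical names chosen for illustration and are not taken from the implementation; the header bytes correspond to a \texttt{STORE DATA} command and the payload is only an example.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Apdu:
    """Short command APDU: four header bytes plus optional command data."""
    cla: int
    ins: int
    p1: int
    p2: int
    data: bytes = b""

    def to_bytes(self) -> bytes:
        if len(self.data) > 255:
            raise ValueError("short APDUs carry at most 255 data bytes")
        header = bytes([self.cla, self.ins, self.p1, self.p2])
        if not self.data:
            return header
        # Lc is only present when command data is sent.
        return header + bytes([len(self.data)]) + self.data


def flip_random_bit(apdu: Apdu, rng: random.Random) -> Apdu:
    """Return a copy of the APDU with exactly one bit of the data field flipped."""
    if not apdu.data:
        return apdu
    mutated = bytearray(apdu.data)
    index = rng.randrange(len(mutated))
    mutated[index] ^= 1 << rng.randrange(8)
    return Apdu(apdu.cla, apdu.ins, apdu.p1, apdu.p2, bytes(mutated))


# Illustrative STORE DATA command; the payload is an example value only.
original = Apdu(0x80, 0xE2, 0x91, 0x00, bytes.fromhex("BF3E035C015A"))
# A seeded generator keeps a fuzzing run reproducible.
mutated = flip_random_bit(original, random.Random(1234))
```

Seeding the random generator is a deliberate choice in this sketch: a mutation that triggers a divergent response can then be regenerated and sent to a second \gls{euicc} for comparison.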
While this approach allows more precise control, it has some drawbacks. \gls{rsp} is a stateful protocol, and provisioning actions rely on interaction with the profile vendor's \gls{smdpp} server. Consequently, execution speed is constrained by network latency, backend responsiveness, and the time needed to restore the \gls{euicc} state after a reset.
\paragraph{Fuzzing Strategy}
A challenge in mutating \gls{apdu} messages is that random mutations often lead to invalid \gls{asn1} structures. This effectively reduces the testing strategy to fuzzing the \gls{asn1} decoder, which is only a small part of the \gls{euicc} logic. To increase test effectiveness, the implementation shifted toward fuzzing \textit{valid structured input} rather than arbitrary byte sequences.
To support structured data fuzzing, this thesis uses the Python-based \texttt{hypothesis} library, which implements property-based testing~\cite{maciver_hypothesis_2019}. \texttt{hypothesis} allows the definition of input schemas that mirror the \gls{asn1} structures used in \gls{esim} protocols. From these schemas, it automatically generates valid input data that covers a wide range of edge cases.
This strategy enables testing of:
\begin{itemize}
\item Field boundary conditions (e.g., maximum tag lengths).
\item Rare but valid combinations of optional elements.
\item Complex nesting of \gls{tlv} structures.
\end{itemize}
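
To illustrate the boundary conditions listed above, the following sketch encodes \gls{tlv} objects whose value sizes lie on both sides of the BER length-encoding boundaries, together with one nested structure. It uses only the Python standard library; the helper names and the chosen tags are assumptions for illustration, and a structure-aware generator such as \texttt{hypothesis} would produce values of this kind rather than enumerate them by hand.

```python
def encode_length(length: int) -> bytes:
    """Encode a BER definite length (short form below 128, long form otherwise)."""
    if length < 0:
        raise ValueError("length must be non-negative")
    if length < 0x80:
        return bytes([length])
    body = length.to_bytes((length.bit_length() + 7) // 8, "big")
    # Long form: the first byte has bit 8 set and counts the length bytes.
    return bytes([0x80 | len(body)]) + body


def encode_tlv(tag: bytes, value: bytes) -> bytes:
    """Concatenate tag, BER length and value into one TLV object."""
    return tag + encode_length(len(value)) + value


# Value sizes on both sides of each length-encoding boundary.
BOUNDARY_SIZES = (0, 127, 128, 255, 256)

# One TLV per boundary size; the tag 5A is an arbitrary example.
samples = {size: encode_tlv(b"\x5A", bytes(size)) for size in BOUNDARY_SIZES}

# Nested structure: an outer constructed tag wrapping an inner primitive TLV.
nested = encode_tlv(b"\xBF\x3E", encode_tlv(b"\x5C", b"\x5A"))
```

Inputs built this way remain well-formed for the decoder, so they reach the logic behind the \gls{asn1} parser instead of being rejected at the first malformed length.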
In the following sections, the technical details of each implementation component, including the \gls{lpa} logic, mutation framework, and fuzzing harness, are presented.
\section{Tracing}
\label{sec:tracing}
% functions:
% - trace traffic from the simtrace2, map the traffic to function calls i.e. identify which function the call handles, record the traced traffic
% - replay: replay the previously recorded traffic to euicc in pcsc reader, check for differences in responses
% parts:
% - pcsc_link: wrapper for the python smartcard library, handles session establishment to reader, and apdu/tpdu transmission, automatically handles requesting of available data i.e. status word 61XX
% - card: represents card in the pcsc card reader, identifies card type (i.e sgp22, sgp.22 test, normal sim, etc) and which applications are installed (ISDR, ECASD, etc), used to send \glspl{apdu} to pcsc card through pcsc link
% - tracer: dummy implementation of card for instruction interpretation and apdu parsing, uses pysim gsmtap as apdu source
% - recorder: handles tracer thread and recording of \glspl{apdu}, starts tracer main thread (continuously listens for new \glspl{apdu} from gsmtap until timeout is reached or canceled by user) and records apdu to recording, has target isd-r as argument
% - recording: represents a list of recorded \glspl{apdu}, handles source and target isd-r addresses, file saving and loading as well as checking if the file is replayable
% - replay: establishes connection to pcsc via pcsc link, loads recorded \glspl{apdu} and sends them over the link to the connected euicc, switches out source isd-r and target isd-r during replay, compares response status word to recorded status word and prints an error if there is a difference
The tracing component is responsible for capturing, interpreting, and replaying \gls{apdu} communication between an \gls{lpa} (or other source) and the \gls{euicc}. This forms the foundation of the differential testing framework by allowing the same interaction sequence to be executed across multiple \glspl{euicc} for behavioral comparison.
The tracing functionality comprises two main operations:
\begin{itemize}
\item \textbf{Tracing and recording:} Captures \gls{apdu} traffic from a physical interface using \texttt{simtrace2}~\cite{osmocom_simtrace_nodate} and associates it with functional interpretations (e.g., profile enablement, deletion). The \glspl{apdu} are parsed and stored along with contextual information such as sender and receiver addresses.
\item \textbf{Replaying:} Replays previously recorded \gls{apdu} sequences to an \gls{euicc} in a \gls{pcsc} card reader. It replaces context-specific identifiers and checks for discrepancies in response behavior.
\end{itemize}
\begin{figure}[h!]
\includesvg[width=\textwidth]{Graphics/trace_setup.svg}
\caption{Tracing lab setup}
\label{img:trace_setup}
\end{figure}
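
The capture side of this setup can be sketched as follows. The snippet strips the \gls{gsmtap} header from a received \gls{udp} datagram and returns the payload for further parsing. It is a simplified illustration under stated assumptions: the function and class names are hypothetical, the header layout (version, header length in 32-bit words, frame type) and the SIM frame type value follow the public \gls{gsmtap} definition as far as known, and the actual tracer relies on \texttt{pysim} for this step.

```python
import struct
from typing import NamedTuple

GSMTAP_PORT = 4729        # UDP port commonly used for GSMTAP
GSMTAP_TYPE_SIM = 0x04    # frame type for SIM card traffic (assumed value)


class GsmtapSimFrame(NamedTuple):
    version: int
    payload: bytes


def parse_gsmtap_sim(datagram: bytes) -> GsmtapSimFrame:
    """Strip the GSMTAP header from a datagram and return the SIM payload."""
    if len(datagram) < 4:
        raise ValueError("datagram too short for a GSMTAP header")
    version, header_words, frame_type = struct.unpack_from("!BBB", datagram)
    header_len = header_words * 4  # header length is given in 32-bit words
    if header_len < 4 or len(datagram) < header_len:
        raise ValueError("inconsistent GSMTAP header length")
    if frame_type != GSMTAP_TYPE_SIM:
        raise ValueError(f"not a SIM frame: type 0x{frame_type:02x}")
    return GsmtapSimFrame(version, datagram[header_len:])


# Synthetic datagram: 16-byte header followed by an example payload.
example_header = bytes([2, 4, GSMTAP_TYPE_SIM]) + bytes(13)
example_frame = parse_gsmtap_sim(example_header + bytes.fromhex("00A4040000"))

# Reading from a live socket would look roughly like this:
# import socket
# with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
#     sock.bind(("127.0.0.1", GSMTAP_PORT))
#     frame = parse_gsmtap_sim(sock.recv(65535))
```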
The implementation consists of several key components:
\begin{description}
\item[\texttt{PcscLink}] A thin wrapper over the Python \texttt{pyscard} library~\cite{rousseau_pyscard_2025} that abstracts away low-level communication with \gls{pcsc}-compatible card readers. It handles session establishment, \gls{apdu} and \gls{tpdu} transmission, and automatic processing of status words such as \texttt{61XX} (i.e., issuing \texttt{GET RESPONSE} when necessary).
\item[\texttt{Card}] Represents a connected card in a \gls{pcsc} reader. It queries the card to determine its type (e.g., standard \gls{sim}, test \gls{euicc}, or commercial \gls{euicc}), and identifies installed applications such as \texttt{\gls{isdr}} or \texttt{\gls{ecasd}}. The class serves as the interface for sending \glspl{apdu} to the card through the \texttt{PcscLink}.
\item[\texttt{Tracer}] A dummy implementation of the \texttt{Card} interface used during passive tracing. It parses incoming \glspl{apdu} from the \gls{gsmtap} interface using \texttt{pysim} and attempts to classify them based on instruction type. This allows mapping observed \glspl{apdu} to functional operations.
\item[\texttt{Recorder}] Coordinates tracing and recording. It spawns a separate tracer thread that listens for \glspl{apdu} from \gls{gsmtap} in a loop until a timeout occurs or a stop signal is issued. \glspl{apdu} are recorded alongside the designated target \texttt{\gls{isdr}} for later analysis.
\item[\texttt{recording}] An abstraction for a recorded session. It stores the list of \glspl{apdu}, associated source and target \texttt{\gls{isdr}} addresses, and metadata. It provides serialization functions for saving to and loading from disk, as well as validity checks to determine whether a recording is replayable.
\item[\texttt{replay}] Loads a saved \texttt{recording}, connects to the target \gls{euicc} via \texttt{PcscLink}, and replays each \gls{apdu}. During replay, the source and target \texttt{\gls{isdr}} values are automatically substituted. The response status words from the target \gls{euicc} are compared against those from the original trace. Any mismatch is reported to highlight divergent behavior.
\end{description}
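
The automatic handling of \texttt{61XX} status words described for \texttt{PcscLink} can be sketched as follows. This is a simplified illustration, not the actual implementation: the transport callable and the \texttt{FakeCard} class are hypothetical stand-ins for a real reader connection, and the \texttt{GET RESPONSE} command is assumed to be sent with class byte \texttt{00}.

```python
from typing import Callable, Tuple

# A transport takes a raw command APDU and returns (response data, SW1, SW2).
Transport = Callable[[bytes], Tuple[bytes, int, int]]


def transmit_with_get_response(transport: Transport, apdu: bytes) -> Tuple[bytes, int, int]:
    """Send an APDU and collect chained response data signalled by SW1 = 0x61."""
    data, sw1, sw2 = transport(apdu)
    collected = bytearray(data)
    while sw1 == 0x61:
        # SW2 announces how many bytes are still available.
        get_response = bytes([0x00, 0xC0, 0x00, 0x00, sw2])
        data, sw1, sw2 = transport(get_response)
        collected += data
    return bytes(collected), sw1, sw2


class FakeCard:
    """Stand-in for a card that hands out its response in chunks."""

    def __init__(self, response: bytes, chunk_size: int) -> None:
        if not 1 <= chunk_size <= 255:
            raise ValueError("chunk_size must be between 1 and 255")
        self.pending = response
        self.chunk_size = chunk_size
        self.sent = []  # every APDU the card has received

    def _status(self) -> Tuple[int, int]:
        if self.pending:
            return 0x61, min(len(self.pending), self.chunk_size)
        return 0x90, 0x00

    def __call__(self, apdu: bytes) -> Tuple[bytes, int, int]:
        self.sent.append(apdu)
        if apdu[1] != 0xC0:
            # Initial command: return no data, only announce what is available.
            return (b"",) + self._status()
        chunk, self.pending = self.pending[: apdu[4]], self.pending[apdu[4]:]
        return (chunk,) + self._status()


card = FakeCard(bytes(range(10)), chunk_size=4)
data, sw1, sw2 = transmit_with_get_response(card, bytes.fromhex("80E2910006BF3E035C015A"))
```

To the caller, the exchange appears as a single command with one complete response, although several \glspl{apdu} were sent in the background.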
This modular structure allows for easy integration into automated test pipelines and manual inspection tools, and it lays the groundwork for the mutation-based and structure-aware fuzzing techniques described in subsequent sections.
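
The comparison performed during replay can be illustrated with a short sketch. The data classes and the \texttt{replay} function below are hypothetical simplifications of the components described above; the \texttt{send} callable stands in for a link to the reader and returns only the status word, and the two \texttt{ISD-R} application identifiers are used purely as example values.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class RecordedExchange:
    command: bytes      # raw command APDU as captured
    status_word: int    # recorded status word, e.g. 0x9000


@dataclass(frozen=True)
class Mismatch:
    index: int
    expected: int
    actual: int


def replay(
    exchanges: Sequence[RecordedExchange],
    send: Callable[[bytes], int],
    source_aid: bytes,
    target_aid: bytes,
) -> List[Mismatch]:
    """Replay recorded commands and report every differing status word."""
    if len(source_aid) != len(target_aid):
        raise ValueError("AID substitution must not change the APDU length")
    mismatches = []
    for index, exchange in enumerate(exchanges):
        # Swap the recorded ISD-R identifier for the one used by the target card.
        command = exchange.command.replace(source_aid, target_aid)
        actual = send(command)
        if actual != exchange.status_word:
            mismatches.append(Mismatch(index, exchange.status_word, actual))
    return mismatches


SOURCE_AID = bytes.fromhex("A0000005591010FFFFFFFF8900000100")
TARGET_AID = bytes.fromhex("A0000005591010FFFFFFFF8900050500")

recording = [
    RecordedExchange(bytes.fromhex("00A4040010") + SOURCE_AID, 0x9000),
    RecordedExchange(bytes.fromhex("80E2910006BF3E035C015A"), 0x9000),
]


def fake_card(command: bytes) -> int:
    """Example target: accepts SELECT for its own AID, rejects everything else."""
    if command[1] == 0xA4:
        return 0x9000 if command.endswith(TARGET_AID) else 0x6A82
    return 0x6A88


report = replay(recording, fake_card, SOURCE_AID, TARGET_AID)
```

In this example the \texttt{SELECT} command succeeds because of the identifier substitution, while the second command yields a status word that differs from the recorded one and is therefore reported.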
\section{LPA}
\label{sec:lpa}
% due to the limitations of the tracer to replay rsp correctly -> need for lpa to execute interaction with euicc with valid input
% lpa handles communication over different interfaces as defined in sgp22
% we are using sgp v3.1 -> newest version at the date of writing this thesis
% lpa implementation consists of different parts
% card
% represents euicc that is currently inserted into the pcsc card reader
% once created: starts scanning for supported applications on the card
% checks which application responds for which class, instruction code, and adf
% adf is important: esim on sim applications often deviate from the common adf that is proposed in the sgp22, application implementations contain multiple known adfs -> card selects the one that is used by the euicc
% handles application selection and keeps track of the currently selected application to prevent reselection or unnecessary traffic
% pcsc link
% uses the pySim LinkBaseTpdu class as base
% once initialised it uses given pcsc card reader to establish an exclusive connection to the card reader
% exclusive connection: euiccs have a state -> we would lose state if other cards could perform file selections etc. in between -> no shared connection with other programs
% during the connection process a few steps happen:
% - check which protocol is supported T=0 or T=1
% - establish connection via given protocol and check for errors
% link can be established via a python context manager -> automatically close connection once context is exited
% handles apdu transmission and tpdu transmission
% apdu transmission also handles some return codes
% - 9FXX, 61XX, 62XX, 63XX: automatically request available response bytes -> response bytes are automatically attached to original r-apdu -> to the caller it appears as one apdu even though in the background multiple \glspl{apdu} were sent
% before sending \glspl{apdu} it may also perform mutation by calling the optional mutation engine (ref mutation engine section in apdu fuzzing)
% as well as records the \glspl{apdu} (ref apdu fuzzing section)
% application
% represents euicc application like isd-r, ecasd etc and implements application specific functionality, also handles apdu communication with pcsc link for application related traffic i.e store_data command
% has main/common adf address and multiple aliases which are used by different vendors for esim implementations
% ADFs for eSIM on SIM applications that we had access to
% using simtrace2 and vendor lpa traffic to find out which adf was used for the isd-r
% Common isd-r adf: A0000005591010FFFFFFFF8900000100
% 5Ber.esim: A0000005591010FFFFFFFF8900050500
% Xesim: A0000005591010FFFFFFFF8900000177
% esim.me: A0000005591010000000008900000300
% current implementations consist of isd-r/isd-p, estk_fwupd
% applications communicate to card via pcsc link
% application functions trigger store_data command -> internally uses asn1tools to encode and decode data
% once data is decoded: application functions use response data specific data classes to parse and validate data
% these data classes use pydantic for serialization and deserialization as well as decoding and encoding of data -> easier to handle base64 encoded data which often is returned by the smdp+ as well as decode special data such as bitstrings, hex strings as well as version types
% implemented using custom decoders and encoders -> makes it easier to read data
% bit strings: chain of bits where each bit represents a function or piece of data that is given or not given -> pydantic serializer mixin makes it easier to represent this information in code and also makes it easier to use this information (i.e when used as library) to check whether a bit is set or not
% order: data dict -> store_data -> use asn1tools to encode -> build apdu -> send apdu -> decode return data -> parse and decode with data class
%isd-r implementation handles all rsp related functions
% estk_fwupd: implements the proprietary estk update mechanism
% this was reverse engineered and is further explained in the findings section (ref findings section)
% can return currently installed firmware version, unlock the euicc to accept a new firmware, install new binary
% footnote: unlocking the euicc differs from the card unlocking functionality (cite global platform specs which define unlocking) which allows the use of gp commands to, for example, install new java card applets
%note: maybe explain store_data command in more detail i.e apdu splitting -> indicate that more data follows
% exception handling
% sgp22 defines possible errors for interactions
% euicc returns error code -> user has to know which kind of error was triggered
% for us its rather important to exactly know which errors were triggered not only to know which of those errors are expected but also to know what went wrong
% exception handling triggers exact exception based on error code (insert code listing which defines errors and raises them based on the exception)
% smdp client
% lpa not only handles communication to euicc but also to the smdp+ server -> client side implementation is necessary -> defined as es9+ interface in sgp22
% uses httpx as base for http communication
% sgp22 defines that the header should indicate the supported rsp version
% { "Content-Type": "application/json", "User-Agent": "gsma-rsp-lpad", "X-Admin-Protocol": "gsma/rsp/v3.1.0" }
% server only accepts json data in body (cite sgp22 definition) -> each value of the key/value pair is base64 encoded
% pydantic is used for deserialization of response data
% before returning the data to the caller -> client checks for error on server and eventually raises the corresponding exception -> as explained in the exception handling part
% smdp+ client is mostly used by the isd-r
\section{Fuzzing}
\label{sec:fuzzing}
\subsection{Data Fuzzing}
\label{subsec:data_fuzzing}
\subsection{APDU Fuzzing}
\label{subsec:apdu_fuzzing}
\section{CLI}
\label{sec:cli}