Update on Overleaf.

2026-02-04 11:07:43 +00:00 · 2025-07-04 23:19:24 +00:00
parent e4419c403d
commit b6be91e4d9
7 changed files with 630 additions and 571 deletions
--- a/Chapters/Implementation.tex
+++ b/Chapters/Implementation.tex
@@ -23,9 +23,9 @@
 % - tests for edge cases
 % - in the following sections i will go into details on how each implementation work

-The primary goal of this thesis is to conduct a security analysis of commercial \gls{esim} implementations using differential testing. The underlying idea of this approach is to systematically compare the behavior of different \gls{euicc} implementations under the same inputs to detect inconsistencies or vulnerabilities. The focus lies particularly on components and behaviors that differentiate traditional \gls{sim} cards from \glspl{esim}, such as profile download and profile mangement capabilites.
+The primary goal of this thesis is to conduct a security analysis of commercial \gls{esim} implementations through differential testing. We adopt a systematic approach to compare the behavior of different \gls{euicc} implementations under identical inputs to uncover inconsistencies and potential vulnerabilities. Our focus lies particularly on components and behaviors that differentiate traditional \gls{sim} cards from \glspl{esim}, such as profile download and profile management capabilities.

-Differential testing is applied through structured fuzzing, using both valid and mutated \gls{apdu} sequences. By observing how different \glspl{euicc} respond to identical input, the approach aims to uncover deviations that may indicate security flaws or implementation weaknesses.
+To perform differential testing, we designed a structured fuzzing methodology that employs both valid and mutated \gls{apdu} sequences. By observing and comparing how multiple \glspl{euicc} respond to the same inputs, we aim to uncover deviations that may indicate security flaws or implementation weaknesses.

 \section{Design}

@@ -33,9 +33,9 @@ This section presents the step-by-step refinement of the testing strategy. The i

 \paragraph{Initial Naive Approach}

-The first implementation was based on a straightforward observation setup using the \texttt{simtrace2} tool. \texttt{simtrace2}~\cite{osmocom_simtrace_nodate} allows monitoring of communication between a physical device (typically a smartphone acting as the \gls{lpa}) and a \gls{sim} card. The tool captures \glspl{apdu} and forwards them via \gls{udp} packets to a local socket. From this socket, the \gls{apdu} data can be read, parsed, and analyzed.
+We first implemented a simple observation setup using the \texttt{simtrace2} tool. \texttt{simtrace2}~\cite{osmocom_simtrace_nodate} allows monitoring of communication between a physical device (typically a smartphone acting as the \gls{lpa}) and a \gls{sim} card. The tool captures \glspl{apdu} and forwards them via \gls{udp} packets to a local socket. From there, we parsed and analyzed the \gls{apdu} data.

-The proposed method was to:
+Our proposed methodology involved the following steps:
 \begin{enumerate}
    \item Record the \gls{apdu} traffic between the \gls{lpa} and the \gls{euicc} during an \gls{rsp} session.
    \item Store this traffic in a structured format.
@@ -43,41 +43,37 @@ The proposed method was to:
    \item Replay each recorded \gls{apdu} and monitor the response.
 \end{enumerate}

-The goal was to detect behavioral differences, such as differing \glspl{sw} or execution failures. However, this method proved infeasible in practice due to the nature of the \gls{rsp} protocol: many operations are cryptographically bound to the specific session using signed nonces, meaning that replaying recorded traffic is not possible.
+The goal was to detect behavioral differences, such as differing \glspl{sw} or execution failures. However, we discovered that this method was impractical in real-world scenarios. Due to the nature of the \gls{rsp} protocol, many operations involve cryptographic bindings using session-specific nonces, rendering traffic replay infeasible.

 \paragraph{Controlled LPA Implementation}

- To overcome the limitations of passive traffic replay, a new strategy was developed. Rather than relying on the proprietary \gls{lpa} applications often provided by \gls{esim} vendors, we implemented our own minimal \gls{lpa}. The motivation behind this was twofold:
+To address the limitations of passive traffic replay, we developed our own minimal and controllable \gls{lpa}. Instead of relying on proprietary \gls{lpa} applications supplied by \gls{esim} vendors, we opted to implement a custom solution for two key reasons:

 \begin{itemize}
    \item Vendor \glspl{lpa} often introduce extraneous or undocumented traffic unrelated to the provisioning process, which complicates analysis.
    \item A custom \gls{lpa} allows for controlled mutation and injection of \gls{apdu} sequences.
 \end{itemize}

-The implemented \gls{lpa} performs a target operation (e.g., profile download or enablement) by issuing the appropriate command sequence to the \gls{euicc} in the PC/SC card reader. Before sending, \glspl{apdu} can be programmatically mutated to evaluate robustness of the implementation against malformed or unexpected inputs. The \gls{lpa} records returned status words and checks for behavioral consistency across different \glspl{euicc}.
+The implemented \gls{lpa} performs a target operation (e.g., profile download or enablement) by issuing the appropriate command sequence to the \gls{euicc} in the PC/SC card reader. Prior to transmission, we programmatically mutate \glspl{apdu} to test the implementation's robustness against malformed or unexpected input. We then record the resulting status words and assess behavioral consistency across different \gls{euicc} devices.

-While this approach allows for a more precise control, it has some drawbacks. \gls{rsp} is a stateful protocol, and provisioning actions rely on interaction with the profile vendor's \gls{smdpp} server. Consequently, execution speed is constrained by network latency and backend responsiveness as well as restoring the \gls{euicc} state after a reset.
+While our approach allows for more precise control, it has some drawbacks. \gls{rsp} is a stateful protocol, and provisioning actions rely on interaction with the profile vendor's \gls{smdpp} server. Consequently, execution speed is constrained by network latency, backend responsiveness, and the time required to restore the \gls{euicc} state after a reset.

 \paragraph{Fuzzing Strategy}

-A challenge in mutating \gls{apdu} messages is that random mutations often lead to invalid \gls{asn1} structures. This effectively reduces the testing strategy to fuzzing the \gls{asn1} decoder, which constitutes only a small component of the overall \gls{euicc} logic. While this approach can reveal vulnerabilities in the \gls{asn1} parser, especially given that parsing vulnerabilities in \gls{asn1}-based decoders have historically led to critical security issues \cite{mitre_cve_2003, nist_nvd_2024, nist_nvd_2025}, it tends to produce limited coverage of the higher-level application logic implemented in the card.
+When applying mutations to \gls{apdu} messages, we encountered a common issue: random mutations frequently produce invalid \gls{asn1} structures. This narrows the testing focus to the \gls{asn1} decoder, which represents only a small portion of the total \gls{euicc} logic. Despite this limitation, fuzzing at the decoding layer can still yield valuable results, as parsing flaws in \gls{asn1}-based decoders have historically led to critical vulnerabilities~\cite{mitre_cve_2003, nist_nvd_2024, nist_nvd_2025}.

-Nonetheless, the effectiveness of fuzzing the \gls{asn1} parser layer should not be underestimated. Invalid or malformed inputs may still expose critical flaws, such as memory corruption or improper bounds checking within parser implementations. Consequently, early-stage fuzzing using random or deterministic byte-level mutations can serve as a useful baseline for robustness testing at the decoding boundary.
+To improve the depth and scope of our fuzzing efforts, we adapted our implementation to generate and mutate structurally valid input instead. By preserving the syntactic and semantic correctness of \gls{asn1} structures, we enabled the fuzzer to exercise deeper layers of application logic. This allowed us to test state transitions, logical constraints, and error handling mechanisms that would otherwise remain untriggered by malformed data.

-To broaden the scope and increase the effectiveness of the fuzzing strategy, the implementation was adapted to focus on generating and mutating \textit{structurally valid input} instead. By preserving the syntactic and semantic integrity of the underlying \gls{asn1} structures, the fuzzer is able to explore deeper application logic paths beyond the decoder. This allows for a more comprehensive evaluation of the \gls{euicc} system, including internal state transitions, logical constraints, and error handling routines that are only triggered in the presence of valid but semantically diverse \glspl{apdu}.
+To support this structured fuzzing approach, we integrated the Python-based \texttt{hypothesis} library, which provides property-based testing capabilities~\cite{maciver_hypothesis_2019}. Using \texttt{hypothesis}, we defined input schemas mirroring the \gls{asn1} structures employed in the SGP.22 specification~\cite{gsma_sgp22_2025}. The framework then automatically generates valid input covering a wide range of edge cases.

-To support structured data fuzzing, this thesis uses the Python-based \texttt{hypothesis} library, which implements property-based testing~\cite{maciver_hypothesis_2019}. \texttt{hypothesis} allows definition of input schemas that mirror \gls{asn1} structures used in \gls{esim} protocols. From these schemas, it automatically generates valid input data covering a wide range of edge cases.
-
-This strategy enables testing of:
+With this setup, we were able to test:
 \begin{itemize}
    \item Field boundary conditions (e.g., maximum tag lengths).
    \item Rare but valid combinations of optional elements.
    \item Complex nesting of \gls{tlv} structures.
 \end{itemize}

-In the following sections, the technical details of each implementation component, including the \gls{lpa} logic, mutation framework, and fuzzing harness, are presented.
-
-
+In the following sections, we present the technical implementation details of our \gls{lpa} logic, input mutation framework, and fuzzing harness.

 \section{Tracing}
 \label{sec:tracing}
@@ -95,7 +91,7 @@ In the following sections, the technical details of each implementation componen

 We built the tracing component to capture and interpret \glspl{apdu} exchanged between an \gls{lpa} (or other source) and the \gls{euicc}, and to replay them by inserting the recorded \glspl{apdu} into the communication between the \gls{lpa} and the \gls{euicc}. This forms the foundation of the differential testing framework by allowing the same interaction sequence to be executed across multiple \glspl{euicc} for behavioral comparison.

-The tracing functionality comprises two main operations:
+Our tracing functionality comprises two main operations:

 \begin{itemize}
    \item \textbf{Tracing and recording:} Captures \gls{apdu} traffic from a physical interface using \texttt{simtrace2}~\cite{osmocom_simtrace_nodate} and associates it with functional interpretations (e.g., profile enablement, deletion). The \glspl{apdu} are parsed and stored along with contextual information such as sender and receiver addresses.
@@ -124,7 +120,7 @@ The implementation consists of several key components:
    \item[\texttt{replay}] Loads a saved \texttt{recording}, connects to the target \gls{euicc} via \texttt{PcscLink}, and replays each \gls{apdu}. During replay, the source and target \texttt{\gls{isdr}} values are automatically substituted. The response status words from the target \gls{euicc} are compared against those from the original trace. Any mismatch is reported to highlight divergent behavior.
 \end{description}

-This modular structure allows for easy integration into both automated test pipelines and manual inspection tools, and lays the groundwork for both mutation-based and structure-aware fuzzing techniques described in subsequent sections.
+This modular structure allows for easy integration into both automated test pipelines and manual inspection tools, and lays the groundwork for the mutation-based and structure-aware fuzzing techniques that we describe in subsequent sections.


 \section{LPA}
@@ -199,7 +195,7 @@ This modular structure allows for easy integration into both automated test pipe
 % before returning the data to the caller -> client checks for error on server and eventually raises the corresponding exception -> as explained in the exception handling part
 % smdp+ client is mostly used by the isd-r

-Due to the limitations of the \texttt{tracer} implementation in correctly replaying \gls{rsp} interactions, we developed a dedicated \gls{lpa} implementation to initiate valid interactions with the \gls{euicc}. This enables the controlled generation and mutation of valid traffic which we will further explain in \cref{sec:fuzzing}. Our implementation targets the SGP.22 v3.1 specification, which was the latest version available at the time of writing \cite{gsma_sgp22_2025}.
+Due to the inability of the \texttt{tracer} implementation to accurately replay \gls{rsp} interactions, we developed a dedicated \gls{lpa} to initiate valid interactions with the \gls{euicc}. This custom \gls{lpa} provides us with full control over the generation and mutation of traffic, enabling structured and repeatable interaction patterns. We describe the mutation and fuzzing strategies enabled by this setup in detail in \cref{sec:fuzzing}. Our implementation specifically targets the SGP.22 v3.1 specification, which, at the time of writing, represented the most recent version available~\cite{gsma_sgp22_2025}.

 The \gls{lpa} is composed of multiple components:

@@ -225,9 +221,9 @@ Known \glspl{adf} for \gls{isdr} observed during analysis:
  \item esim.me: \texttt{A0000005591010000000008900000300}
 \end{itemize}

-The decoded response data is further processed we use \texttt{pydantic} data classes. \texttt{pydantic}~\cite{colvin_pydantic_2025} is a python library that enable structured parsing of values including Base64-encoded strings, bitfields, version types, and more. We implemented custom encoders/decoders to simplify readability and downstream data processing. For bit fields, a mixin is used to allow checking for specific feature flags via simple accessors.
+To process the decoded response data further, we use \texttt{pydantic} data classes. \texttt{pydantic}~\cite{colvin_pydantic_2025} is a Python library that enables structured parsing of values, including Base64-encoded strings, bit fields, version types, and more. We implemented custom encoders and decoders to improve readability and simplify downstream data processing. For bit fields, a mixin allows checking for specific feature flags via simple accessors.

-The \texttt{estk\_fwupd} application implements a proprietary firmware update interface, which we reverse-engineered (see \cref{sec:eval_tracing}). It supports reading the current firmware version, unlocking\footnote{This unlocking is distinct from \gls{gp}-defined unlocking, which allows the execution of generic \gls{gp} commands. See \gls{gp} Card Specification.} the \gls{euicc} for updates, and installing new binaries.
+The \texttt{estk\_fwupd} application implements a proprietary firmware update interface, which we reverse-engineered (see \cref{sec:eval_tracing}). It supports reading the current firmware version, unlocking\footnote{This unlocking is distinct from \gls{gp}-defined unlocking, which allows the execution of generic \gls{gp} commands. See \gls{gp} Card Specification~\cite{globalplatform_gp_2018}.} the \gls{euicc} for updates, and installing new binaries.

 \paragraph{Exception Handling}
 The SGP.22 standard defines a variety of response codes and error conditions. We map these response codes to custom exception classes in the \gls{lpa} implementation to enable precise error handling. This is essential for both debugging and for the differential testing framework to reason about diverging behavior across implementations. A code listing of the exception handling mappings is provided in \cref{sec:exception-handling}.
@@ -242,9 +238,9 @@ In addition to \gls{euicc} communication, the \gls{lpa} implementation must inte
 }
 \end{lstlisting}

-Payload values are Base64-encoded as required by the specification. Response data is deserialized using \texttt{pydantic}. Error responses from the server trigger the appropriate exception, as explained previously.
+We encode payload values in Base64 format, as mandated by the specification. To process server responses, we deserialize the returned data using custom \texttt{pydantic} data classes that model the expected structure. In the event of an error response, our implementation raises the appropriate exception, following the error-handling logic outlined in the previous section.

-The \gls{smdpp} client is primarily used by the \gls{isdr} application to execute \gls{rsp}-related functionality.
+The \gls{smdpp} client is primarily used by our \gls{isdr} application to execute \gls{rsp}-related functionality.

 \section{Fuzzing}
 \label{sec:fuzzing}
@@ -332,9 +328,9 @@ To uncover behavioral differences between \gls{euicc} implementations, we implem

 \subsubsection*{Fuzzing Scenarios and Execution}

-Fuzzing is conducted through predefined \emph{scenarios}—sequences of function calls that operate on the \gls{euicc}. Each function in a scenario interacts with the \gls{euicc} through the \gls{lpa} and is subject to mutation. The scenario runner initiates a fresh PC/SC link, resets the card into a clean state (processing all notifications and performing a full memory reset) by calling the \texttt{eUICCMemoryReset} function using our \gls{lpa} implementation, and executes each function with multiple mutations.
+We perform fuzzing through predefined \emph{scenarios}, which consist of ordered sequences of function calls targeting the \gls{euicc}. Each function within a scenario is executed via our custom \gls{lpa} implementation and serves as a potential mutation point. To ensure a consistent test environment, the scenario runner establishes a fresh PC/SC connection and resets the card into a clean state by invoking the \texttt{eUICCMemoryReset} operation. This includes processing all pending notifications and performing a full memory wipe prior to execution.

-This process is guided by an \textbf{operation recorder} that tracks each function call, applied mutations, and resulting responses in a structured \emph{mutation tree}. Each tree node represents a specific function call executed with one type of mutation. A tree level corresponds to a function in the scenario and sibling nodes represent different mutations of that function.
+To systematically track the fuzzing process, we developed an \textbf{operation recorder} that logs every function invocation, the applied mutations, and the corresponding responses. This data is structured as a hierarchical \emph{mutation tree}, where each node represents a function call with a specific mutation applied. Each level in the tree corresponds to a function in the scenario, while sibling nodes denote alternative mutations of the same function.

 \subsubsection*{Mutation Engine}
 \label{subsubsec:mutation_engine}
@@ -370,6 +366,7 @@ Figure \cref{fig:scenario_flow} illustrates the \gls{apdu} fuzzing workflow, whi
 \begin{figure}
 	\centering
    \input{Graphics/record_scenario_flow.tikz}
+    % \resizebox{\textwidth}{!}{\input{Graphics/record_scenario_flow.tikz}}
    \caption{Flow for recording a scenario.}
    \label{fig:scenario_flow}
 \end{figure}
@@ -403,7 +400,7 @@ The decision process for selecting the next mutation to apply is a key component
    \label{fig:next_mutation_flow}
 \end{figure}

-The algorithm, illustrated in \cref{fig:next_mutation_flow}, operates based on the current node in the mutation tree. Each node represents a function invocation, and its children represent the same invocation with different mutations. The logic proceeds as follows:
+Our algorithm, illustrated in \cref{fig:next_mutation_flow}, operates based on the current node in the mutation tree. Each node represents a function invocation, and its children represent the same invocation with different mutations. The logic proceeds as follows:

 \begin{enumerate}
  \item \textbf{Check for untried mutations at the current node:}  
@@ -562,14 +559,32 @@ def test_get_profiles(self, use_iccid, profile_class, tags):
 This approach preserves the semantics and structure of the expected \gls{asn1} types while still allowing a wide variety of edge cases to be exercised.

 \paragraph{Implementation Scope}
-Due to reliance on external infrastructure, such as the \gls{smdpp} server, our fuzzing campaign focuses exclusively on the \gls{euicc}-side of the \gls{rsp} protocol. Fuzzing requests directed at the \gls{smdpp} would lead to excessive traffic and could be misinterpreted as \gls{dos} attempts. Therefore, we restrict our tests to those functions defined in the ES10a, ES10b, and ES10c interfaces of the SGP.22 specification, which form the communication layer between the \gls{lpa} and the \gls{euicc}, specifically focusing on functions that accept structured input arguments and directly interact with the \gls{euicc}.
+Because the \gls{rsp} process relies on external infrastructure such as the \gls{smdpp} server, our fuzzing campaign focuses exclusively on the \gls{euicc} side of the \gls{rsp} protocol. Fuzzing requests directed at the \gls{smdpp} would generate excessive traffic and could be misinterpreted as \gls{dos} attempts. We therefore restrict our tests to functions defined in the ES10a, ES10b, and ES10c interfaces of the SGP.22 specification, which form the communication layer between the \gls{lpa} and the \gls{euicc}, and concentrate on functions that accept structured input arguments and interact directly with the \gls{euicc}.


 Specifically, we implemented fuzzing tests for the following functions:
 \begin{itemize}
-    \item \textbf{ES10a:} \texttt{SetDefaultDpAddress}
-    \item \textbf{ES10b:} \texttt{PrepareDownload}, \texttt{LoadBoundProfilePackage}, \texttt{AuthenticateServer}
-    \item \textbf{ES10c:} \texttt{GetProfileInfo}, \texttt{EnableProfile}, \texttt{DisableProfile}, \texttt{DeleteProfile}, \texttt{eUICCMemoryReset}, \texttt{SetNickname}
+    \item \textbf{ES10a:}
+    \begin{itemize}
+        \item \texttt{SetDefaultDpAddress}
+    \end{itemize}
+    
+    \item \textbf{ES10b:}
+    \begin{itemize}
+        \item \texttt{PrepareDownload}
+        \item \texttt{LoadBoundProfilePackage}
+        \item \texttt{AuthenticateServer}
+    \end{itemize}
+    
+    \item \textbf{ES10c:}
+    \begin{itemize}
+        \item \texttt{GetProfileInfo}
+        \item \texttt{EnableProfile}
+        \item \texttt{DisableProfile}
+        \item \texttt{DeleteProfile}
+        \item \texttt{eUICCMemoryReset}
+        \item \texttt{SetNickname}
+    \end{itemize}
 \end{itemize}

 \paragraph{Fuzzing Lifecycle}