Update on Overleaf.

2026-02-04 03:07:43 +00:00 · 2025-07-14 23:39:17 +00:00
parent 19fc98c9ac
commit bc0d25ba87
8 changed files with 300 additions and 103 deletions
--- a/Chapters/Design.tex
+++ b/Chapters/Design.tex
@@ -93,7 +93,7 @@ The first design focuses on capturing and replaying real interaction sequences b
 \section{Design 2: APDU Fuzzing}
 \label{subsec:design_2}

-The second design focuses on exploring the input space of the \gls{euicc} \gls{rdp} protocol stack by mutating valid \glspl{apdu}. The aim is to test robustness against malformed, unexpected, or edge-case inputs and to expose implementation-level inconsistencies.
+The second design focuses on exploring the input space of the \gls{euicc} \gls{rsp} protocol stack by mutating valid \glspl{apdu}. The aim is to test robustness against malformed, unexpected, or edge-case inputs and to expose implementation-level inconsistencies.

 \paragraph{Design Rationale.} While real traces offer insight into typical usage, they often fail to reveal vulnerabilities related to invalid inputs. \gls{apdu} fuzzing is essential for testing the correctness of error handling and boundary enforcement.

--- a/Chapters/Evaluation.tex
+++ b/Chapters/Evaluation.tex
@@ -128,7 +128,7 @@ The \textbf{9esim v2} card, in comparison, fully supports both \gls{isdr} access

 Among all evaluated \glspl{esim}, \texttt{estk.me} stands out due to its publicly available firmware update utility, which is offered via their official website. This binary executable provides both the latest firmware images and the means to apply updates directly from a host machine, making it a unique case in comparison to other \gls{esim} vendors, none of whom publicly expose firmware updates or custom flashing endpoints. 

-\subsubsection*{Firmware Structure and Analysis}
+\paragraph{Firmware Structure and Analysis.}

 The firmware image accompanying the update utility appears to be encrypted or obfuscated. An entropy analysis conducted using the Shannon entropy metric indicates a consistently high entropy across all tested firmware files, suggesting the presence of encryption or compression. For instance, the entropy of the T001V06 firmware image was measured at approximately \texttt{7.998}, which is close to the theoretical maximum of \texttt{8.0} for purely random data.
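The entropy figure quoted above can be reproduced with a few lines of Python. The following is a minimal sketch and not the tooling used in this work; the function name \texttt{shannon\_entropy} is an assumption introduced for illustration. It computes the Shannon entropy in bits per byte, so uniformly random data approaches \texttt{8.0} and constant data yields \texttt{0.0}.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of `data` in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    # Count how often each byte value (0..255) occurs.
    counts = Counter(data)
    # H = -sum(p * log2(p)) over all byte values that actually occur.
    return -sum((n / total) * math.log2(n / total) for n in counts.values())
```

Applied to the raw bytes of a firmware image, a value close to \texttt{8.0} is consistent with encryption or compression, but it cannot distinguish between the two.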

@@ -176,7 +176,7 @@ The firmware image accompanying the update utility appears to be encrypted or ob

 A deeper static analysis using Ghidra~\cite{nsa_ghidra_2025} did not reveal any recognizable structure or file headers, further supporting the assumption of encryption. Similarly, tools like Binwalk~\cite{refirmlabs_binwalk_2025} did not detect known compression schemes, embedded file systems, or file signatures. Consequently, firmware payload analysis could not be meaningfully performed beyond block-level transmission.

-\subsubsection*{Firmware Update Mechanism}
+\paragraph{Firmware Update Mechanism.}

 The update mechanism exposes two primary functions via a custom \gls{aid} endpoint:
 \begin{itemize}
@@ -184,7 +184,7 @@ The update mechanism exposes two primary functions via a custom \gls{aid} endpoi
    \item \texttt{flash\_firmware}: performs the actual firmware flashing process.
 \end{itemize}

-The \gls{aid}  used to access the update utility differs based on firmware generation. For example, the test card (generation T001) uses the \gls{aid}:
+The \gls{aid} used to access the update utility differs based on firmware generation. For example, the test card (generation T001) uses the \gls{aid}:
 \begin{quote}
    \centering
    \texttt{A06573746B6D65FFFFFFFF6677757064} \\
@@ -192,7 +192,9 @@ The \gls{aid}  used to access the update utility differs based on firmware gener
 \end{quote}
 Firmware versions follow the format \texttt{TXXXVXX}, which encodes the major generation (\texttt{T000}--\texttt{T003}) and the minor version. Firmware updates are incremental and strictly one-way: the tool automatically selects the next version based on the currently installed one, and downgrade paths are not supported.
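The version scheme can be illustrated with a short Python sketch. It is not taken from the update utility: the helper names \texttt{parse\_version} and \texttt{is\_next\_version} are hypothetical, and the sketch assumes that both fields are decimal digits and that "next version" means the minor version increases by exactly one within the same generation.

```python
import re

# Assumption: TXXXVXX uses three decimal digits for the generation
# and two decimal digits for the minor version, e.g. "T001V06".
_VERSION_RE = re.compile(r"^T(\d{3})V(\d{2})$")


def parse_version(version: str) -> tuple[int, int]:
    """Split a version string such as 'T001V06' into (generation, minor)."""
    match = _VERSION_RE.match(version)
    if match is None:
        raise ValueError(f"not a TXXXVXX version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def is_next_version(installed: str, candidate: str) -> bool:
    """Return True if `candidate` is the direct successor of `installed`.

    Assumption: updates stay within one generation and advance the minor
    version by exactly one step; downgrades are never allowed.
    """
    inst_gen, inst_minor = parse_version(installed)
    cand_gen, cand_minor = parse_version(candidate)
    return cand_gen == inst_gen and cand_minor == inst_minor + 1
```

Under these assumptions, \texttt{is\_next\_version("T001V06", "T001V07")} holds, while a downgrade to \texttt{T001V05} is rejected.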

-\subsubsection*{Firmware Flashing Procedure}
+Earlier releases of the firmware update utility included the corresponding C source code used to build the update binary. More recent versions, however, are distributed only as precompiled binaries, without source code. Access to the original source-based versions is now limited and typically requires archival services such as the Wayback Machine\footnote{https://web.archive.org/web/20250000000000*/https://www.estk.me/downloads/}.
+
+\paragraph{Firmware Flashing Procedure.}

 The update process proceeds as follows:
 \begin{enumerate}
@@ -210,7 +212,7 @@ The update process proceeds as follows:

 The update tool fails gracefully only under specific conditions. For malformed or incorrect \glspl{apdu}, the \gls{euicc} abruptly terminates the connection, leading to a \texttt{CardConnectionException} raised by the PC/SC stack. This maps to the \texttt{SCARD\_E\_NOT\_TRANSACTED (0x80100016)} error, which occurs when the host attempts to communicate over a closed or non-existent connection~\cite{corcoran_pcsc-lite_2025}.

-\subsubsection*{Reverse Engineering and Mutation Testing}
+\paragraph{Reverse Engineering and Mutation Testing.}

 Using insights gained from disassembly in Ghidra, a Python-based reimplementation of the update mechanism was developed and made available via the \gls{cli}, as shown in \cref{sec:cli}. This allowed fine-grained control over the update process and enabled targeted mutation-based testing of firmware blocks.

@@ -242,6 +244,13 @@ The same mutation strategies as shown in \cref{subsubsec:mutation_engine}, such
 \section{Tracing}
 \label{sec:eval_tracing}

+
+\begin{figure}[t]
+    \includesvg[width=\textwidth]{Graphics/reSIMulate_setup.svg}
+    \caption{Tracing lab setup}
+    \label{img:trace_setup}
+\end{figure}
+
 To investigate vendor-specific behaviors in \gls{rsp}, we employed SIMTrace2 to capture the \glspl{apdu} exchanged between the \gls{lpa} running on a phone and the \gls{esim}. This enabled us to analyze the communication protocols used during profile management and \gls{euicc} interaction, especially focusing on the discovery and selection of the \gls{isdr}.

 During the analysis of the \texttt{eSIM.me} eSIM, we observed the use of a custom \gls{aid} during the SELECT command for \gls{isdr}. The following listing illustrates a sample trace captured while launching the \texttt{esim.me} Android application:
@@ -287,9 +296,6 @@ While tracing provides valuable insights into command sequencing and \gls{aid} s
 \section{Data Fuzzing}
 \label{sec:data_fuzzing_evaluation}

-\todo{Introduce different setups to make it more obvious when conducting seperate experiments} 
-
-
 \begin{table}[t]
    \begin{adjustwidth}{-.5in}{-1.5in} 
    \centering
@@ -314,7 +320,15 @@ While tracing provides valuable insights into command sequencing and \gls{aid} s
    \end{adjustwidth}
 \end{table}

-We conducted data fuzzing, as described in \cref{subsec:data_fuzzing}, on all tested \gls{esim} cards with the exception of \texttt{estk.me}. Each test case is executed sequentially across all eligible \glspl{esim} to ensure consistency and reproducibility of results.
+Using the setup shown in \cref{img:fuzz_setup}, we conducted data fuzzing, as described in \cref{subsec:data_fuzzing}, on all tested \gls{esim} cards with the exception of \texttt{estk.me}. Each test case is executed sequentially across all eligible \glspl{esim} to ensure consistency and reproducibility of results.
+
+
+\begin{figure}[t]
+    \centering
+    \includesvg[width=.6\textwidth,inkscapelatex=false]{Graphics/reSIMulate_setup_fuzzing}
+    \caption{Fuzzing setup}
+    \label{img:fuzz_setup}
+\end{figure}

 The majority of the cards handled the fuzzed input data as expected, either processing the requests successfully or rejecting them gracefully with standard-compliant error responses. However, notable exceptions were observed during the execution of the \texttt{GetProfileInfo} test case as shown in \cref{tab:data_fuzzing_result}, particularly for the following cards:
 \begin{itemize}
@@ -403,8 +417,8 @@ To evaluate the robustness of \gls{rsp} protocol handling and smart card behavio
 \subsection*{Optimizing for Coverage}
 Initial fuzzing experiments revealed that applying aggressive mutations across all \glspl{apdu} early in a transaction significantly hindered code path exploration. In many cases, although a mutation in an early \gls{apdu} succeeded in provoking an altered behavior or state change, subsequent mutated \glspl{apdu} caused premature failures. As a mitigation strategy, we adopted a greedy mutation approach: once a mutation led to a successful state transition, the subsequent \glspl{apdu} in that session were executed unmutated to allow complete transaction processing and thereby maximize coverage.
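The greedy strategy described above can be sketched in Python as follows. This is an illustrative model, not the fuzzer's actual code: \texttt{run\_greedy\_session}, \texttt{mutate}, and \texttt{transmit} are hypothetical names, \texttt{transmit} is assumed to return \texttt{True} when the card accepts an \gls{apdu}, and the sketch assumes that a rejected \gls{apdu} ends the session.

```python
from typing import Callable, List, Sequence, Tuple


def run_greedy_session(
    apdus: Sequence[bytes],
    mutate: Callable[[bytes], bytes],
    transmit: Callable[[bytes], bool],
) -> List[Tuple[bytes, bool]]:
    """Send a transaction, mutating only until one mutation is accepted.

    Returns the list of (payload actually sent, accepted?) pairs.
    """
    sent: List[Tuple[bytes, bool]] = []
    mutation_accepted = False
    for apdu in apdus:
        # After the first accepted mutation, send the rest unmodified so
        # the transaction can run to completion.
        payload = apdu if mutation_accepted else mutate(apdu)
        ok = transmit(payload)
        sent.append((payload, ok))
        if not ok:
            break  # assumption: a rejected APDU ends the session
        if payload != apdu:
            mutation_accepted = True
    return sent
```

With a card stub that accepts everything, only the first \gls{apdu} is mutated and the remainder of the transaction is replayed unchanged, which is the behavior the greedy approach aims for.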

-\subsection*{Experimental Setup}
-All tests are conducted with test profiles that use the \gls{gsma} Live CI and are offered by various \glspl{mno} \cite{welte_euicc_2024}. For consumer-grade \glspl{esim}, the following profile was used:
+\paragraph{Experimental Setup}
+All tests are conducted with test profiles that use the \gls{gsma} Live CI and are offered by various \glspl{mno}~\cite{welte_euicc_2024}, using the hardware setup shown in \cref{img:fuzz_setup}. For consumer-grade \glspl{esim}, the following profile was used:
 \begin{center}
 \texttt{LPA:1\$rsp.truphone.com\$QR-G-5C-1LS-1W1Z9P7}
 \end{center}
@@ -416,32 +430,124 @@ For the \texttt{sysmoEUICC}, which uses the \gls{gsma} test certificate, the fol

 The execution time of fuzzed \gls{apdu} sequences varied depending on chip processing speed and the degree to which the mutated input triggered deeper protocol branches.

-\subsection*{Observed Errors}
-The following classes of errors were consistently encountered during mutation campaigns:
+\paragraph{Observed Errors}
+% The following classes of errors were consistently encountered during mutation campaigns:
+% 
+% \todo{Put errors into a table with example mutations and explain in text}
+% \begin{itemize}
+%     \item \textbf{SCP03TSecurityError}: Occurred during the \texttt{\justify LoadBoundProfilePackage} step, particularly when transmitting \texttt{sequenceOf86}, \texttt{sequenceOf88}, or the initial \texttt{sequenceOf87}. This indicates failure during Secure Channel Protocol 03 (terminal-side variant) session establishment.
+%     
+%     \item \textbf{ApduException}: Triggered by malformed \gls{asn1} structures, typically due to mutations altering length or tag fields.
+% 
+%     \item \textbf{InvalidCertificate}: Observed during \texttt{AuthenticateServer} and \texttt{PrepareDownload}. The \gls{euicc} rejected the server certificate during validation.
+% 
+%     \item \textbf{InvalidSignature}: Raised exclusively during \texttt{\justify InitialiseSecureChannelRequest}, indicating that the \gls{euicc} failed to verify the signature required for secure channel establishment.
+% 
+%     \item \textbf{UnsupportedRemoteOpType}: Also restricted to \texttt{\justify InitialiseSecureChannelRequest}. Mutation operators such as \texttt{ZERO\_BLOCK} or \texttt{TRUNCATE} corrupted the operation type field, which is normally set to \texttt{\justify installBoundProfilePackage (1)}.
+% 
+%     \item \textbf{UnsupportedCurve}: Introduced via bit-level mutations affecting certificate parameters. The \gls{euicc} did not support the altered elliptic curve definition.
+% 
+%     \item \textbf{UndefinedError}: A non-specific error raised by the \gls{euicc} during \texttt{AuthenticateServer} and \texttt{FirstSequenceOf87}, implying the input could not be classified under known error types.
+% 
+%     \item \textbf{DecodeTagError}: Thrown by the Python-side \gls{asn1} decoder during parsing of malformed responses. A representative example:
+%     \begin{quote}
+%         \texttt{euiccSignature1: Expected OCTET STRING (tag '5f37'), but got '2474'. (At offset: 271)}
+%     \end{quote}
+%     This indicates an inconsistency between the \gls{ber}-encoded tag and its expected value, likely due to mutated length fields or premature truncation. Notably, this malformed \gls{apdu} was still accepted and responded to by the \gls{euicc}, albeit with corrupt data.
+% 
+% \end{itemize}

-\todo{Put errors into a table with example mutations and explain in text}
-\begin{itemize}
-    \item \textbf{SCP03TSecurityError}: Occurred during the \texttt{\justify LoadBoundProfilePackage} step, particularly when transmitting \texttt{sequenceOf86}, \texttt{sequenceOf88}, or the initial \texttt{sequenceOf87}. This indicates failure during Secure Channel Protocol 03 (terminal-side variant) session establishment.
-    
-    \item \textbf{ApduException}: Triggered by malformed \gls{asn1} structures, typically due to mutations altering length or tag fields.
+\begin{table}[ht]
+    \begin{adjustwidth}{-1in}{}
+        \centering
+        \caption{Overview of Observed Error Types by Functions and Mutation Types}
+        \label{tab:error_overview}
+        \begin{tabular}{|p{4cm}|p{5cm}|p{4cm}|}
+        \hline
+        \textbf{Error Type} & \textbf{Functions} & \textbf{Mutation Types} \\
+        \hline
+        ApduException & 
+        \texttt{authenticate\_server}, \texttt{firstSequenceOf87}, \texttt{get\_euicc\_challenge}, \texttt{get\_euicc\_info\_1}, \texttt{initialiseSecureChannel\-Request}, \texttt{prepare\_download}, \texttt{sequenceOf86}, \texttt{sequenceOf88} & 
+        bitflip, random\_byte, shuffle\_blocks, truncate, zero\_block \\
+        \hline
+        SCP03TSecurityError & 
+        \texttt{firstSequenceOf87}, \texttt{sequenceOf86}, \texttt{sequenceOf88} & 
+        bitflip, random\_byte, shuffle\_blocks, truncate, zero\_block \\
+        \hline
+        InvalidCertificate & 
+        \texttt{authenticate\_server}, \texttt{prepare\_download} & 
+        bitflip, random\_byte, truncate, zero\_block \\
+        \hline
+        InvalidSignature & 
+        \texttt{initialiseSecureChannel\-Request} & 
+        bitflip, random\_byte, truncate, zero\_block \\
+        \hline
+        UnsupportedRemote\-OpType & 
+        \texttt{initialiseSecureChannel\-Request} & 
+        truncate, zero\_block \\
+        \hline
+        UndefinedError & 
+        \texttt{authenticate\_server}, \texttt{firstSequenceOf87} & 
+        bitflip, random\_byte, truncate \\
+        \hline
+        ProfileInstallation\-Exception & 
+        \texttt{sequenceOf86} & 
+        truncate \\
+        \hline
+        InvalidTransactionID & 
+        \texttt{initialiseSecureChannel\-Request}, \texttt{prepare\_download} & 
+        bitflip, shuffle\_blocks \\
+        \hline
+        IncorrectInputValues & 
+        \texttt{initialiseSecureChannel\-Request}, \texttt{sequenceOf86} & 
+        bitflip, random\_byte, truncate \\
+        \hline
+        EuiccException & 
+        \texttt{get\_euicc\_challenge}, \texttt{get\_euicc\_info\_1} & 
+        shuffle\_blocks \\
+        \hline
+        DecodeTagError & 
+        \texttt{authenticate\_server} & 
+        truncate \\
+        \hline
+        UnsupportedCurve & 
+        \texttt{authenticate\_server} & 
+        bitflip \\
+        \hline
+        EuiccChallenge\-Mismatch & 
+        \texttt{authenticate\_server} & 
+        bitflip \\
+        \hline
+        \end{tabular}
+    \end{adjustwidth}
+\end{table}
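To make the mutation types named in the table concrete, the following Python sketch gives one plausible implementation of each operator. The exact semantics of the mutation engine are not specified in this section, so the function bodies, the \texttt{block\_size} parameter, and the \texttt{MUTATORS} registry are assumptions made for illustration only.

```python
import random


def bitflip(data: bytes, rng: random.Random) -> bytes:
    """Flip a single randomly chosen bit."""
    if not data:
        return data
    buf = bytearray(data)
    buf[rng.randrange(len(buf))] ^= 1 << rng.randrange(8)
    return bytes(buf)


def random_byte(data: bytes, rng: random.Random) -> bytes:
    """Overwrite one randomly chosen byte with a random value."""
    if not data:
        return data
    buf = bytearray(data)
    buf[rng.randrange(len(buf))] = rng.randrange(256)
    return bytes(buf)


def truncate(data: bytes, rng: random.Random) -> bytes:
    """Cut the payload at a random position, keeping at least one byte."""
    if len(data) < 2:
        return data
    return data[: rng.randrange(1, len(data))]


def zero_block(data: bytes, rng: random.Random, block_size: int = 4) -> bytes:
    """Replace a contiguous block (hypothetical size) with zero bytes."""
    if not data:
        return data
    buf = bytearray(data)
    start = rng.randrange(len(buf))
    end = min(start + block_size, len(buf))
    buf[start:end] = bytes(end - start)
    return bytes(buf)


def shuffle_blocks(data: bytes, rng: random.Random, block_size: int = 4) -> bytes:
    """Split the payload into fixed-size blocks and permute them."""
    blocks = [data[i : i + block_size] for i in range(0, len(data), block_size)]
    rng.shuffle(blocks)
    return b"".join(blocks)


# Registry keyed by the mutation type names used in the table.
MUTATORS = {
    "bitflip": bitflip,
    "random_byte": random_byte,
    "truncate": truncate,
    "zero_block": zero_block,
    "shuffle_blocks": shuffle_blocks,
}
```

Passing a seeded \texttt{random.Random} instance keeps individual mutations reproducible, which helps when a failing case has to be replayed against several cards.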

-    \item \textbf{InvalidCertificate}: Observed during \texttt{AuthenticateServer} and \texttt{PrepareDownload}. The \gls{euicc} rejected the server certificate during validation.
+During the mutation-based differential testing campaign, several distinct error types were observed, each arising from specific classes of input corruption or malformed structures. The following paragraphs describe each identified error type, its context of occurrence, and its implications for secure channel establishment and profile installation.

-    \item \textbf{InvalidSignature}: Raised exclusively during \texttt{\justify InitialiseSecureChannelRequest}, indicating that the \gls{euicc} failed to verify the signature required for secure channel establishment.
+One of the most frequently encountered errors was \textbf{SCP03TSecurity\-Error}, which emerged during the execution of the \texttt{LoadBoundProfile\-Package} operation. Specifically, this error was triggered during the transmission of either \texttt{sequence\-Of86}, \texttt{sequence\-Of88}, or the initial \texttt{sequence\-Of87}, pointing to a failure in establishing the Secure Channel Protocol 03 session on the terminal side.

-    \item \textbf{UnsupportedRemoteOpType}: Also restricted to \texttt{\justify InitialiseSecureChannelRequest}. Mutation operators such as \texttt{ZERO\_BLOCK} or \texttt{TRUNCATE} corrupted the operation type field, which is normally set to \texttt{\justify installBoundProfilePackage (1)}.
+Another recurring issue was the \textbf{ApduException}, which was caused by malformed \gls{asn1} structures. These were typically introduced by mutations that altered either the tag or the length fields within the encoded data, leading to parsing failures during transmission.

-    \item \textbf{UnsupportedCurve}: Introduced via bit-level mutations affecting certificate parameters. The \gls{euicc} did not support the altered elliptic curve definition.
+The \textbf{InvalidCertificate} error was predominantly seen during the \texttt{AuthenticateServer} and \texttt{PrepareDownload} commands. In these cases, the \gls{euicc} explicitly rejected the provided server certificate, signaling a failure in certificate validation, possibly due to mutations in the certificate’s structure or encoding.

-    \item \textbf{UndefinedError}: A non-specific error raised by the \gls{euicc} during \texttt{AuthenticateServer} and \texttt{FirstSequenceOf87}, implying the input could not be classified under known error types.
+Closely related, the \textbf{InvalidSignature} error was observed exclusively during the \texttt{InitialiseSecureChannelRequest} command. Here, the \gls{euicc} failed to verify the digital signature, which is a crucial step for establishing a secure communication channel.

-    \item \textbf{DecodeTagError}: Thrown by the Python-side \gls{asn1} decoder during parsing of malformed responses. A representative example:
-    \begin{quote}
-        \texttt{euiccSignature1: Expected OCTET STRING (tag '5f37'), but got '2474'. (At offset: 271)}
-    \end{quote}
-    This indicates an inconsistency between the \gls{ber}-encoded tag and its expected value, likely due to mutated length fields or premature truncation. Notably, this malformed \gls{apdu} was still accepted and responded to by the \gls{euicc}, albeit with corrupt data.
+The \textbf{UnsupportedRemoteOpType} error also occurred solely in the context of the \texttt{InitialiseSecureChannelRequest}. This was typically induced by mutation operators such as \texttt{ZERO\_BLOCK} or \texttt{TRUNCATE}, which corrupted the operation type field. The expected operation type is normally \texttt{installBoundProfilePackage (1)}, but mutations resulted in unrecognized values.
+
+Another structurally induced error was \textbf{UnsupportedCurve}, which resulted from bit-level mutations in the certificate’s cryptographic parameters. Specifically, when the elliptic curve definition was altered, the \gls{euicc} responded with this error, indicating a lack of support for the mutated curve.
+
+A less specific but nonetheless significant error was \textbf{UndefinedError}, which occurred during the \texttt{Authenticate\-Server} and \texttt{FirstSequence\-Of87} steps. This error type indicated that the mutated input did not match any known failure pattern, resulting in a generic error classification by the \gls{euicc}.
+
+Finally, a critical issue on the decoding side was \textbf{DecodeTagError}, raised by the Python-based \gls{asn1} decoder while parsing malformed responses. A representative decoder error was:
+
+\begin{quote}
+    \texttt{euiccSignature1: Expected OCTET STRING (tag '5f37'), but got '2474'. (At offset: 271)}
+\end{quote}
+
+Such decoding failures indicate inconsistencies between the expected \gls{ber}-encoded tags and the actual mutated values, likely due to truncated structures or corrupted length fields. Notably, in some cases, the \gls{euicc} still accepted and responded to these malformed \glspl{apdu}, albeit with corrupted payloads.
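The effect of truncation and corrupted length fields on tag decoding can be illustrated with a minimal \gls{ber} tag-length-value reader. The sketch below is not the decoder used in this work; \texttt{read\_tag}, \texttt{read\_length}, and \texttt{read\_tlv} are hypothetical helpers that cover only definite-length encodings. It shows how a two-byte tag such as \texttt{5f37} is recognized and why a length field that points past the end of the buffer must be rejected.

```python
from typing import Tuple


def read_tag(data: bytes, offset: int) -> Tuple[bytes, int]:
    """Read a BER tag at `offset`; return (tag bytes, next offset)."""
    if offset >= len(data):
        raise ValueError("truncated before tag")
    end = offset + 1
    # Low five bits all set means the tag number continues in further bytes.
    if data[offset] & 0x1F == 0x1F:
        while True:
            if end >= len(data):
                raise ValueError("truncated inside multi-byte tag")
            byte = data[end]
            end += 1
            if not byte & 0x80:  # high bit clear marks the last tag byte
                break
    return data[offset:end], end


def read_length(data: bytes, offset: int) -> Tuple[int, int]:
    """Read a definite BER length at `offset`; return (length, next offset)."""
    if offset >= len(data):
        raise ValueError("truncated before length")
    first = data[offset]
    if first < 0x80:  # short form: the byte is the length itself
        return first, offset + 1
    count = first & 0x7F  # long form: number of following length bytes
    if count == 0:
        raise ValueError("indefinite length is not supported in this sketch")
    if offset + 1 + count > len(data):
        raise ValueError("truncated inside length field")
    return int.from_bytes(data[offset + 1 : offset + 1 + count], "big"), offset + 1 + count


def read_tlv(data: bytes, offset: int = 0) -> Tuple[bytes, bytes, int]:
    """Read one TLV; return (tag, value, offset of the next element)."""
    tag, pos = read_tag(data, offset)
    length, pos = read_length(data, pos)
    if pos + length > len(data):
        raise ValueError("length field exceeds available data")
    return tag, data[pos : pos + length], pos + length
```

A decoder that skips the final bounds check would continue parsing at a wrong offset and interpret arbitrary payload bytes as the next tag, which is one way a mismatch like the one quoted above can arise.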
+
+To aid analysis and comparison, \cref{tab:error_overview} summarizes these errors alongside the functions in which they occurred and the mutation types that triggered them.

-\end{itemize}


 % successful mutations
@@ -591,79 +697,148 @@ A truncated \gls{apdu} with trailing padding bytes removed (e.g., \texttt{BF2000
 A single-bit mutation changed a byte from \texttt{0xA0} to \texttt{0xA1}. Despite the low-level alteration, the \gls{apdu} was accepted, suggesting a fallback to an alternate logical channel or slightly different interpretation of \gls{tlv} structure.

 \paragraph{AUTHENTICATE\_SERVER Truncation}
-In one case, the mutation engine truncated approximately 75\% of the \texttt{AuthenticateServerRequest} \gls{apdu}—removing critical fields such as \texttt{ctxParams1}, the digital signature, and certificate extensions (including \texttt{2.5.29.14}, \texttt{2.5.29.17}, \texttt{2.5.29.35}). The \gls{apdu} failed \gls{asn1} decoding due to inconsistent length indicators; however, after manual correction, analysis revealed that the missing fields did not prevent the \gls{euicc} from accepting the request under specific conditions.
+In one case, the mutation engine truncated approximately 75\% of the \texttt{AuthenticateServer\-Request} \gls{apdu}, removing critical fields such as \texttt{ctxParams1}, the digital signature, and certificate extensions (including \texttt{2.5.29.14}, \texttt{2.5.29.17}, \texttt{2.5.29.35}). The \gls{apdu} failed \gls{asn1} decoding due to inconsistent length indicators. However, after manual correction, analysis revealed that the missing fields did not prevent the \gls{euicc} from accepting the request under specific conditions.

 \todo{Try to find some more fuzzing statistics i.e runs, mutations per second, explored paths}

 \subsection{Certificate Validation Bypass}
 \label{subsec:certificate_bypass}

-Further experimentation revealed a state machine vulnerability in the \gls{rsp} implementation of \glspl{euicc} manufactured by Eastcompeace. This vulnerability allows partial bypass of the \gls{gsma} certificate chain validation under specific mutated sequences of \texttt{AuthenticateServerRequest} messages.
+% Further experimentation revealed a state machine vulnerability in the \gls{rsp} implementation of \glspl{euicc} manufactured by Eastcompeace. This vulnerability allows partial bypass of the \gls{gsma} certificate chain validation under specific mutated sequences of \texttt{AuthenticateServer\-Request} messages.
+
+According to the \gls{gsma} specification SGP.22~\cite{gsma_sgp22_2025}, the server authentication process is expected to follow a strict mutual authentication model:

-\subsubsection*{Expected Behavior}
-According to SGP.22:
 \begin{quote}
-``The Server (the entity providing the function, e.g., \gls{smdpp}) SHALL be authenticated first by the Client (the entity requesting the function). Authentication SHALL include the verification of a Server Certificate chain ending at an \gls{esim} \gls{ca} RootCA Certificate (section 4.5.2).''~\cite{gsma_sgp22_2025}
+ ``The Server (the entity providing the function, e.g., \gls{smdpp}) SHALL be authenticated first by the Client (the entity requesting the function). Authentication SHALL include the verification of a Server Certificate chain ending at an \gls{esim} \gls{ca} RootCA Certificate (section 4.5.2).''
 \end{quote}
-The server certificate $Cert_{Sa}$ must be verified against the \gls{gsma} root-of-trust $Cert_{CI}$, and the digital signature of the \texttt{AuthenticateServerRequest} must be valid.

-\subsubsection*{Observed Behavior}
-By combining a series of failed and mutated authentication requests, it was possible to trigger incorrect trust decisions by the \gls{euicc}:
+This implies that the server certificate $Cert_{Sa}$ must be validated against the \gls{gsma} root-of-trust $Cert_{CI}$, and the digital signature contained in the \texttt{AuthenticateServerRequest} must be successfully verified.

-\begin{enumerate}
-    \item A first \texttt{AuthenticateServerRequest} containing a malformed or mutated certificate (e.g., bit-flipped data) is sent to the \gls{euicc}. As expected, this request fails.
-    
-    \item A second \texttt{AuthenticateServerRequest} is sent, using a valid profile but with the certificate $Cert_{Sa}$ truncated (i.e., digital signature and extensions removed). Surprisingly, the \gls{euicc} accepts this message and continues with the profile installation process.

-    \item \gls{asn1} decoding of the truncated certificate $Cert_{Sa}$ on the \gls{lpa} side fails due to missing signature fields, but the \gls{euicc} does not reject it, indicating that signature verification was skipped.
+To evaluate the robustness of this authentication flow, we formulate the following research questions:

-    \item Swapping the \texttt{SubjectPublicKeyInfo} in the truncated certificate $Cert_{Sa}$ with an attacker-controlled public key $SK_A$ still results in successful mutual authentication and secure channel establishment. This would imply that the \gls{euicc} did not use the attacker controlled public key, but instead reused the previously sent publlic key from the failed request, as otherwise the \gls{smdpp} would have not been able to verify the signature in the subsequent client authentication $\mathrm{Cert}_{Sa} \triangleleft \mathrm{Cert}_{CI}$ as shown in \cref{fig:authenticate_server_sd}.
-\end{enumerate}
+\begin{description}
+    \item[RQ1:] Can certificate chain validation be bypassed during \texttt{AuthenticateServerRequest} handling?
+    \item[RQ2:] Does the \gls{euicc} cache internal state across failed authentication attempts, and does this influence subsequent request handling?
+    \item[RQ3:] Is it possible to skip digital signature verification by exploiting session state reuse?
+\end{description}

-\begin{figure}[h!]
-	\centering
-    \includesvg[width=\textwidth]{Graphics/authenticate_server_sd.svg}
-    \caption{Sequence diagram of the authenticate server process.\cite{ahmed_security_2024}}
-    \label{fig:authenticate_server_sd}
+\subsection{RQ1: Certificate Truncation and Validation Bypass}
+\label{sec:rq1-truncation}
+
+Initial tests involved sending a malformed authentication request $AS_1$ containing a syntactically invalid server certificate. As expected, the \gls{euicc} rejected this message.
+
+However, a second request $AS_2$, sent immediately afterward, contained a valid certificate structure but with truncated content: the digital signature and the X.509 extensions were removed. Surprisingly, the \gls{euicc} accepted this truncated certificate and proceeded with profile installation. Notably, the \gls{asn1} decoder of the \gls{lpa} rejected the certificate, confirming that the signature field was malformed or missing. This implies that the \gls{euicc} skipped validation of both the certificate chain and the certificate's signature.
+
+Thus, we conclude that certificate truncation can bypass mandatory verification mechanisms, answering \textbf{RQ1} in the affirmative.
+
+\subsection{RQ2: Session Caching and Cross-Request State Leakage}
+\label{sec:rq2-caching}
+
+To understand the acceptance of truncated messages, further experiments evaluated whether the \gls{euicc} reuses internal state from failed authentication attempts. In a typical experiment, an otherwise valid certificate $Cert_{Sa}$ that had been malformed by mutation was first sent in $AS_1$, followed by a truncated certificate in $AS_2$. If both requests were made to the same \gls{smdpp}, $AS_2$ succeeded, even though it did not include a verifiable public key or signature.
+
+In contrast, changing the \gls{smdpp} between requests (e.g., from \texttt{rsp.truphone.com} to \texttt{rsp-eu.redteamobile.com}) caused the same second request to fail, returning an \texttt{UndefinedError}. This behavior strongly suggests that the \gls{euicc} caches per-server authentication state, most likely including the server’s public key or related cryptographic material.
+
+Moreover, this cached state persisted even after power-cycling the device or removing and reinserting the eSIM, suggesting that the cache is stored in non-volatile memory within the \gls{euicc}.
+
+\begin{figure}[ht]
+    \centering
+    \includegraphics[width=0.8\linewidth]{figures/auth-bypass-cache.pdf}
+    \caption{Behavioral difference between same-server and cross-server request sequences. Truncation bypass succeeds only when $AS_1$ and $AS_2$ are sent to the same \gls{smdpp}, implying server-specific session caching in the \gls{euicc}.}
+    \label{fig:auth-bypass-cache}
 \end{figure}

-A deeper analysis of mutated \gls{apdu} payloads of the initial \texttt{AuthenticateServerRequest}, particularly in cases using different \gls{smdpp} servers, shows the following:
+Figure~\ref{fig:auth-bypass-cache} summarizes the behavioral divergence between same-server and cross-server requests, highlighting the role of cached state in the vulnerability.

-\begin{itemize}
-    \item The first mutation typically occurs in the tag of the \texttt{AuthenticateServerRequest}, where the correct \texttt{BF38} tag is flipped to an invalid \texttt{BE38}. Correcting this tag provides a valid \gls{asn1} structure and makes the remaining data decodable.
-    
-    \item In cross-server scenarios (different \gls{smdpp}), subsequent flips occur in fields such as \texttt{euiccPki\-ToBeUsed}, \texttt{server\-Certificate.\-serial\-Number}, \texttt{server\-Signature1}, and both the \texttt{euicc\-Challenge} and \texttt{server\-Challenge} components of \texttt{server\-Signed1}. This attempt will not be able to pass the bug.
-    
-    \item In same-server scenarios (same \gls{smdpp}), a similar flip in the tag is observed. Upon correction, mutations manifest in \texttt{euiccCiPKIdToBeUsed}, \texttt{serverCertificate.issuer}, \texttt{serverCertificate.serialNumber}, \texttt{serverSignature1}, and again in \texttt{serverSigned1.serverChallenge}. This scenario will be able to pass the truncation bug an successfully install the profile.
-\end{itemize}
+\subsection{RQ3: Signature Skipping via Truncation}
+\label{sec:rq3-signature}

-These observations further validate the theory that the certificate from the initial failed \texttt{AuthenticateServerRequest} is cached and reused by the \gls{euicc}. In both the same-server and cross-server mutation scenarios, the public key remains unmodified, yet the behavior diverges based on the prior session’s state. Additionally, certificate validation $\mathrm{Cert}_{Sa} \triangleleft \mathrm{Cert}_{CI}$ appears to be bypassed entirely, as the mutated certificate $Cert_{Sa}$ contents never represent a structurally valid or correctly signed certificate at any point in the provisioning flow.
+The third research question concerns the validity of the signature in the truncated certificate. In further tests, we replaced the \texttt{SubjectPublicKeyInfo} field in $AS_2$ with an attacker-controlled public key $SK_A$. Despite this modification, the \gls{euicc} continued to establish a secure channel, provided that $AS_1$ had been sent previously with a valid certificate.

-\subsubsection*{State Persistence Across Sessions}
-Further experiments showed that this vulnerability persists across sessions and even power cycles:
+This suggests that the \gls{euicc} did not verify the signature in $AS_2$ at all, and instead reused the public key from the initial, failed request. Importantly, if $AS_1$ was omitted or came from a different \gls{smdpp}, the truncated $AS_2$ was correctly rejected. This supports the hypothesis that signature validation is skipped when previous authentication state is available.

-\begin{itemize}
-    \item A failed mutated authentication is followed by a successful truncated certificate installation—even after the \gls{euicc} is physically removed and reinserted into the card reader.
-    
-    \item If different profiles from the same \gls{smdpp} are used (e.g., activation codes \texttt{QR-G-5C-1LS-1W1Z9P7} and \texttt{QRF-SPEEDTEST}), the bypass remains possible.
-    
-    \item However, if different \gls{smdpp} servers are used (e.g., \texttt{rsp.truphone.com} vs \texttt{rsp-eu.redteamobile.com}), the attack fails, and the \gls{euicc} returns an \texttt{UndefinedError}. This supports the idea that the \gls{euicc} reuses the public key from the previously sent but failed \texttt{AuthenticateServerRequest}.
-\end{itemize}
+\subsection{Security Implications}
+\label{sec:implications}

-To further probe the caching mechanism, additional tests were conducted in which the public key of the second \texttt{AuthenticateServerRequest} was substituted into the first request. Based on previous observations, it was hypothesized that this would allow a successful installation by aligning the reused state with the new request.
+The combination of certificate truncation, session caching, and signature skipping presents a serious violation of GSMA’s SGP.22 requirements. Server authentication can be bypassed under realistic conditions, enabling an attacker to impersonate a legitimate \gls{smdpp} server if they can trigger specific request sequences.

-However, these attempts consistently resulted in an \texttt{UndefinedError} exception. The same error occurred even when replacing the entire certificate with the one used in the second request. This strongly suggests that the \gls{euicc} caches more than just the certificate object, potentially including intermediate cryptographic material or session-specific internal state derived from the original malformed request.
+Specifically, the \gls{euicc} fails to:

-This reinforces the assumption that the \gls{euicc} maintains non-trivial persistent state across sessions, and that this state influences the acceptance of subsequent messages even when they originate from clean profiles or well-formed certificates.
+\begin{enumerate}
+    \item Enforce mandatory certificate chain validation against the GSMA Root CA.
+    \item Validate the digital signature of the \texttt{AuthenticateServerRequest} after initial failure.
+    \item Clear cryptographic session state after request rejection or power cycles.
+\end{enumerate}
+
+These behaviors not only violate Sections 4.5.1 and 4.5.2 of the GSMA SGP.22 specification, but also open a practical attack surface for persistent authentication bypass. Further analysis should investigate whether other fields, such as \texttt{serverSigned1} and \texttt{serverSignature1}, are also affected by state reuse.
+
+
+
+% \paragraph{Expected Behavior.}
+% SGP.22 describes the expected server-client authentication behaviour as follows:
+% \begin{quote}
+% ``The Server (the entity providing the function, e.g., \gls{smdpp}) SHALL be authenticated first by the Client (the entity requesting the function). Authentication SHALL include the verification of a Server Certificate chain ending at an \gls{esim} \gls{ca} RootCA Certificate (section 4.5.2).''~\cite{gsma_sgp22_2025}
+% \end{quote}
+% The server certificate $Cert_{Sa}$ must be verified against the \gls{gsma} root-of-trust $Cert_{CI}$, and the digital signature of the \texttt{AuthenticateServer\-Request} must be valid.
+% 
+% \paragraph{Observed Behavior.}
+% By combining a series of failed and mutated authentication requests, it is possible to trigger incorrect trust decisions by the \gls{euicc}:
+% 
+% \begin{enumerate}
+%     \item A first \texttt{AuthenticateServerRequest} $AS_1$ containing a malformed or mutated certificate (e.g., bit-flipped data) is sent to the \gls{euicc}. As expected, this request fails.
+%     
+%     \item A second \texttt{AuthenticateServerRequest} $AS_2$ is sent, using a valid profile but with the certificate $Cert_{Sa}$ truncated (i.e., digital signature and extensions removed). Surprisingly, the \gls{euicc} accepts this message and continues with the profile installation process.
+% 
+%     \item \gls{asn1} decoding of the truncated certificate $Cert_{Sa}$ on the \gls{lpa} side fails due to missing signature fields, but the \gls{euicc} does not reject it, indicating that signature verification was skipped.
+% 
+%     % \item Swapping the \texttt{SubjectPublicKeyInfo} in the truncated certificate $Cert_{Sa}$ with an attacker-controlled public key $SK_A$ still results in successful mutual authentication and secure channel establishment. This would imply that the \gls{euicc} did not use the attacker-controlled public key, but instead reused the previously sent public key from the failed request, as otherwise the \gls{smdpp} would not have been able to verify the signature in the subsequent client authentication $\mathrm{Cert}_{Sa} \triangleleft \mathrm{Cert}_{CI}$ as shown in \cref{fig:authenticate_server_sd}.
+% \end{enumerate}
+% 
+% \begin{figure}[h!]
+% 	\centering
+%     \includesvg[width=\textwidth]{Graphics/authenticate_server_sd}
+%     \caption{Sequence diagram of the authenticate server process.\cite{ahmed_security_2024}}
+%     \label{fig:authenticate_server_sd}
+% \end{figure}
+
+% A deeper analysis of mutated \gls{apdu} payloads of the initial \texttt{Authenticate\-Server\-Request}, particularly in cases using different \gls{smdpp} servers, shows the following:
+% 
+% \begin{itemize}
+%     \item The first mutation typically occurs in the tag of the \texttt{Authenticate\-Server\-Request}, where the correct \texttt{BF38} tag is flipped to an invalid \texttt{BE38}. Correcting this tag provides a valid \gls{asn1} structure and makes the remaining data decodable.
+%     
+%     \item In cross-server scenarios (different \gls{smdpp}), subsequent flips occur in fields such as \texttt{euiccPki\-ToBeUsed}, \texttt{server\-Certificate.\-serial\-Number}, \texttt{server\-Signature1}, and both the \texttt{euicc\-Challenge} and \texttt{server\-Challenge} components of \texttt{server\-Signed1}. This attempt will not be able to pass the bug.
+%     
+%     \item In same-server scenarios (same \gls{smdpp}), a similar flip in the tag is observed. Upon correction, mutations manifest in \texttt{euiccCiPKId\-ToBeUsed}, \texttt{serverCertificate.issuer}, \texttt{server\-Certificate.serial\-Number}, \texttt{serverSignature1}, and again in \texttt{server\-Signed1.server\-Challenge}. This scenario will be able to pass the truncation bug and successfully install the profile.
+% \end{itemize}
+
+
+% These observations further validate the theory that the certificate from the initial failed \texttt{AuthenticateServerRequest} is cached and reused by the \gls{euicc}. In both the same-server and cross-server mutation scenarios, the public key remains unmodified, yet the behavior diverges based on the prior session’s state. Additionally, certificate validation $\mathrm{Cert}_{Sa} \triangleleft \mathrm{Cert}_{CI}$ appears to be bypassed entirely, as the mutated certificate $Cert_{Sa}$ contents never represent a structurally valid or correctly signed certificate at any point in the provisioning flow.
+% 
+% \paragraph{State Persistence Across Sessions.}
+% Further experiments showed that this vulnerability persists across sessions and even power cycles:
+% 
+% \begin{itemize}
+%     \item A failed mutated authentication is followed by a successful truncated certificate installation—even after the \gls{euicc} is physically removed and reinserted into the card reader.
+%     
+%     \item If different profiles from the same \gls{smdpp} are used (e.g., activation codes \texttt{QR-G-5C-1LS-1W1Z9P7} and \texttt{QRF-SPEEDTEST}), the bypass remains possible.
+%     
+%     \item However, if different \gls{smdpp} servers are used (e.g., \texttt{rsp.truphone.com} vs \texttt{rsp-eu.redteamobile.com}), the attack fails, and the \gls{euicc} returns an \texttt{UndefinedError}. This supports the idea that the \gls{euicc} reuses the public key from the previously sent but failed \texttt{AuthenticateServerRequest}.
+% \end{itemize}
+% 
+% To further probe the caching mechanism, additional tests were conducted in which the public key of the second \texttt{Authenticate\-Server\-Request} was substituted into the first request. Based on previous observations, it was hypothesized that this would allow a successful installation by aligning the reused state with the new request.
+% 
+% However, these attempts consistently resulted in an \texttt{UndefinedError} exception. The same error occurred even when replacing the entire certificate with the one used in the second request. This strongly suggests that the \gls{euicc} caches more than just the certificate object, potentially including intermediate cryptographic material or session-specific internal state derived from the original malformed request.
+% 
+% This reinforces the assumption that the \gls{euicc} maintains non-trivial persistent state across sessions, and that this state influences the acceptance of subsequent messages even when they originate from clean profiles or well-formed certificates.

 % \todo{Find out which parts are reused aswell -> serverSigned1, serverSignature}
 % \todo{serverSigned1 -> check if signature is verified}

-\subsubsection*{Implications}
+\paragraph{Implications.}
 The observed behavior violates \gls{gsma} security requirements, particularly in Sections 4.5.1 and 4.5.2 of SGP.22~\cite{gsma_sgp22_2025}, which mandate certificate chain validation and signature verification during server authentication. The following points summarize the core issues:

 \begin{itemize}
-    \item The \gls{euicc} reuses elements (possibly the previously received certificate or derived session keys) from failed \texttt{AuthenticateServerRequest} sessions.
+    \item The \gls{euicc} reuses elements (possibly the previously received certificate or derived session keys) from failed \texttt{AuthenticateServer\-Request} sessions.
    \item Certificate truncation alone does not prevent the \gls{euicc} from proceeding with secure channel establishment.
    \item The signature over the \texttt{tbsCertificate} is not validated after session caching has occurred.
 \end{itemize}
--- a/Chapters/Implementation.tex
+++ b/Chapters/Implementation.tex
@@ -50,16 +50,14 @@ Our tracing functionality comprises two main operations:
    \item \textbf{Replaying:} Replays previously recorded \glspl{apdu} sequences to an \gls{euicc} in a PC/SC card reader. It replaces context-specific identifiers and checks for discrepancies in response behavior.
 \end{itemize}

-\begin{figure}[h!]
-    \includesvg[width=\textwidth]{Graphics/trace_setup.svg}
-    \caption{Tracing lab setup}
-    \label{img:trace_setup}
-    \todo{Add \sysname onto pc image and reference this figure in text}
+\begin{figure}[t]
+    \centering
+    \includesvg[width=.7\textwidth,inkscapelatex=false]{Graphics/reSIMulate_class_tracer.svg}
+    \caption{Simplified overview of components.}
+    \label{img:class_tracer}
 \end{figure}

-\todo{Overview of software components}
-
-The implementation consists of several key components:
+The implementation consists of several key components as shown in \cref{img:class_tracer}:

 \begin{description}
    \item[\texttt{PcscLink}] A thin wrapper over the Python \texttt{pyscard} library~\cite{rousseau_pyscard_2025}, which abstracts away low-level communication with PC/SC-compatible card readers. It handles session establishment, \glspl{apdu}/\gls{tpdu} transmission, and automatic processing of status words such as \texttt{61XX} (i.e., triggering \texttt{GET RESPONSE} when necessary).
@@ -72,7 +70,7 @@ The implementation consists of several key components:

    \item[\texttt{recording}] An abstraction for a recorded session. It stores the list of \glspl{apdu}, associated source and target \texttt{\gls{isdr}} addresses, and metadata. It provides serialization functions for saving to and loading from disk, as well as validity checks to determine whether a recording is replayable.

-    \item[\texttt{replay}] Loads a saved \texttt{recording}, connects to the target \gls{euicc} via \texttt{PcscLink}, and replays each \glspl{apdu}. During replay, the source and target \gls{isdr} values are automatically substituted. The response status words from the target \gls{euicc} are compared against those from the original trace. Any mismatch is reported to highlight divergent behavior.
+    \item[\texttt{replayer}] Loads a saved \texttt{recording}, connects to the target \gls{euicc} via \texttt{PcscLink}, and replays each \gls{apdu}. During replay, the source and target \gls{isdr} values are automatically substituted. The response status words from the target \gls{euicc} are compared against those from the original trace. Any mismatch is reported to highlight divergent behavior.
 \end{description}

 This modular structure allows for easy integration into both automated test pipelines and manual inspection tools, and lays the groundwork for both our mutation-based and structure-aware fuzzing techniques described in subsequent sections.
@@ -154,10 +152,10 @@ Due to the inability of the \texttt{tracer} implementation to accurately replay

 The \gls{lpa} is composed of multiple components:

-\paragraph{Card}
+\paragraph{Card.}
 Represents the \gls{euicc} currently inserted into the PC/SC card reader. Upon initialization, it scans the card for supported applications, identifying the applicable \gls{adf} through probing. This is necessary as eSIM-on-SIM implementations often use proprietary \glspl{adf}, diverging from the \glspl{adf} specified in the SGP.22 standard as we will evaluate in \cref{sec:eval_tracing}. The card object keeps track of the selected application to reduce unnecessary reselection and traffic.

-\paragraph{PC/SC Link}
+\paragraph{PC/SC Link.}
 This component is based on \texttt{pySim}'s \texttt{LinkBaseTpdu}. It establishes an exclusive connection to the PC/SC reader to maintain session state consistency, which is required due to the stateful nature of \gls{euicc} interactions. During initialization:
 \begin{itemize}
  \item The supported transmission protocol (T=0 or T=1) is detected.
@@ -165,8 +163,8 @@ This component is based on \texttt{pySim}'s \texttt{LinkBaseTpdu}. It establishe
 \end{itemize}
 It handles both \gls{apdu} and \gls{tpdu} transmission, automatically requesting additional data when status words such as \texttt{9FXX}, \texttt{61XX}, \texttt{62XX}, or \texttt{63XX} are encountered. When enabled, it invokes an optional mutation engine before sending \glspl{apdu} (see \cref{subsec:apdu_fuzzing}) and also records all traffic for later analysis.
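The \texttt{61XX} handling described above can be illustrated with a minimal sketch. It is not the \texttt{pcsc\_link} implementation: the \texttt{Transport} callable, \texttt{transmit\_with\_get\_response}, and the in-memory \texttt{make\_fake\_card} are hypothetical names introduced here, and only the \texttt{61XX} case is covered (\texttt{9FXX}, \texttt{62XX}, and \texttt{63XX} are left out).

```python
from typing import Callable, Tuple

# Hypothetical transport: takes a raw APDU, returns (response data, SW1, SW2).
Transport = Callable[[bytes], Tuple[bytes, int, int]]


def transmit_with_get_response(
    send: Transport, apdu: bytes, cla: int = 0x00
) -> Tuple[bytes, int, int]:
    """Send an APDU and issue GET RESPONSE while the card answers 61XX."""
    data, sw1, sw2 = send(apdu)
    collected = bytearray(data)
    while sw1 == 0x61:
        # GET RESPONSE: INS=C0, P1=P2=00, Le = byte count announced in SW2.
        get_response = bytes([cla, 0xC0, 0x00, 0x00, sw2])
        data, sw1, sw2 = send(get_response)
        collected.extend(data)
    return bytes(collected), sw1, sw2


def make_fake_card(payload: bytes, chunk: int) -> Transport:
    """In-memory stand-in for a card that hands out `payload` in chunks."""
    remaining = bytearray(payload)

    def send(apdu: bytes) -> Tuple[bytes, int, int]:
        if apdu[1] != 0xC0:
            # Any command other than GET RESPONSE: announce pending data.
            return b"", 0x61, min(len(remaining), chunk)
        le = apdu[4]
        out = bytes(remaining[:le])
        del remaining[:le]
        if remaining:
            return out, 0x61, min(len(remaining), chunk)
        return out, 0x90, 0x00

    return send
```

With a fake card that returns ten bytes in chunks of four, the helper issues three GET RESPONSE commands and returns the reassembled payload together with the final status word.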

-\paragraph{Application}
-Each euicc application (e.g., \gls{isdr}, \gls{ecasd}, ESTK firmware update) is implemented with application-specific logic and communicates with the card via the \texttt{pcsc\_link}. The application layer abstracts encoding/decoding and command sending. For instance, the \texttt{store\_data} command is handled internally using \texttt{asn1tools} for encoding and decoding.
+\paragraph{Application.}
+Each \gls{euicc} application (\eg, \gls{isdr}, \gls{ecasd}) is implemented with application-specific logic and communicates with the card via the \texttt{pcsc\_link}. The application layer abstracts encoding/decoding and command sending. For instance, the \texttt{store\_data} command is handled internally using \texttt{asn1tools} for encoding and decoding.

 Known \glspl{adf} for \gls{isdr} observed during analysis:
 \begin{itemize}
@@ -180,10 +178,10 @@ To decoded response data for further processing, we use \texttt{pydantic} data c

 The \texttt{estk\_fwupd} application implements a proprietary firmware update interface, which we reverse-engineered (see \cref{sec:eval_tracing}). It supports reading the current firmware version, unlocking\footnote{This unlocking is distinct from \gls{gp}-defined unlocking, which allows the execution of generic \gls{gp} commands. See \gls{gp} Card Specification \cite{globalplatform_gp_2018}.} the \gls{euicc} for updates, and installing new binaries.

-\paragraph{Exception Handling}
+\paragraph{Exception Handling.}
 The SGP.22 standard defines a variety of response codes and error conditions. We map these response codes to custom exception classes in the \gls{lpa} implementation to enable precise error handling. This is essential for both debugging and for the differential testing framework to reason about diverging behavior across implementations. A code listing of the exception handling mappings is provided in \cref{sec:exception-handling}.

-\paragraph{SM-DP+ Client}
+\paragraph{SM-DP+ Client.}
 In addition to \gls{euicc} communication, the \gls{lpa} implementation must interact with the \gls{smdpp} server via the ES9+ interface. Our implementation uses \texttt{httpx} for HTTP interactions and adheres to the expected headers and structure as defined by SGP.22:
 \begin{lstlisting}[language=json,caption={ES9+ Request Headers}]
 {
@@ -281,13 +279,19 @@ The \gls{smdpp} client is primarily used by our \gls{isdr} application to execut

 To uncover behavioral differences between \gls{euicc} implementations, we implemented a fuzzing framework that mutates valid \glspl{apdu} generated via our custom \gls{lpa} implementation based on Design 2 in \cref{subsec:design_2}. Unlike the tracing-and-compare approach described earlier, the fuzzing strategy dynamically constructs valid request data and intentionally mutates it prior to transmission, allowing for meaningful analysis of error-handling behavior across cards.

-\subsubsection*{Fuzzing Scenarios and Execution}
+\paragraph{Fuzzing Scenarios and Execution.}

 We perform fuzzing through predefined \emph{scenarios}, which consist of ordered sequences of function calls targeting the \gls{euicc}. Each function within a scenario is executed via our custom \gls{lpa} implementation and serves as a potential mutation point. To ensure a consistent test environment, the scenario runner establishes a fresh PC/SC connection and resets the card into a clean state by invoking the \texttt{eUICCMemoryReset} operation. This includes processing all pending notifications and performing a full memory wipe prior to execution.

-To systematically track the fuzzing process, we developed an \textbf{operation recorder} that tracks every function invocation, the applied mutations, and the corresponding responses. This data is structured as a hierarchical \emph{mutation tree}, where each node represents a function call with a specific mutation applied. Each level in the tree corresponds to a function in the scenario, while sibling nodes denote alternative mutations of the same function. 
+To systematically track the fuzzing process, we developed an \textbf{operation recorder} that tracks every function invocation, the applied mutations, and the corresponding responses. This data is structured as a hierarchical \emph{mutation tree}, where each node represents a function call with a specific mutation applied. Each level in the tree corresponds to a function in the scenario, while sibling nodes denote alternative mutations of the same function. \cref{img:class_basic} shows how the \textbf{operation recorder} integrates into \sysname.

-\subsubsection*{Mutation Engine}
+\begin{figure}[t]
+    \includesvg[width=\textwidth,inkscapelatex=false]{Graphics/reSIMualte_class_basic}
+    \caption{Simplified class diagram of the core classes.}
+    \label{img:class_basic}
+\end{figure}
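The mutation tree described above can be sketched as a small recursive data structure. This is an illustrative assumption about its shape, not the operation recorder's actual classes: \texttt{MutationNode}, its fields, \texttt{iter\_paths}, and the mutation labels are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional, Tuple


@dataclass
class MutationNode:
    function: str                   # scenario function executed at this level
    mutation: Optional[str] = None  # label of the applied mutation, None if unmutated
    status: Optional[str] = None    # response observed for this call
    children: List["MutationNode"] = field(default_factory=list)

    def add_child(self, child: "MutationNode") -> "MutationNode":
        self.children.append(child)
        return child


Path = Tuple[Tuple[str, Optional[str]], ...]


def iter_paths(node: MutationNode, prefix: Path = ()) -> Iterator[Path]:
    """Yield every root-to-leaf sequence of (function, mutation) pairs."""
    path = prefix + ((node.function, node.mutation),)
    if not node.children:
        yield path
    for child in node.children:
        yield from iter_paths(child, path)
```

Each root-to-leaf path corresponds to one complete scenario run; sibling nodes under the same parent are alternative mutations of the same function.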
+
+\paragraph{Mutation Engine.}
 \label{subsubsec:mutation_engine}

 We designed the mutation engine to support both \textit{deterministic} and \textit{random} mutation modes. It implements the following strategies for data transformation:
@@ -307,7 +311,7 @@ We designed the mutation engine to support both \textit{deterministic} and \text
 Deterministic mode ensures reproducibility by applying mutations at fixed, formula-derived offsets, whereas the random mode selects mutation targets probabilistically at runtime. Both modes behave similar to the deterministic and non-deterministic mutation modes used in AFLPlusPlus~\cite{fioraldi_afl_2020}.


-\subsubsection*{Fuzzing Workflow}
+\paragraph{Fuzzing Workflow.}

 Figure \cref{fig:scenario_flow} illustrates the \gls{apdu} fuzzing workflow, which we structured into four main steps:

@@ -335,7 +339,7 @@ We repeat this process for all functions defined in the scenario, producing a co
    \label{fig:tree_structure}
 \end{figure}

-\subsubsection*{Determine Next Mutation Logic}
+\paragraph{Determine Next Mutation Logic.}
 % shown in figure4 (flow graph on how to determine next mutation)
 % goals we want to try all mutations for each node
 % handled by operation recorder and next mutation is requeststed by pcsc link
@@ -378,15 +382,15 @@ This strategy is both exhaustive and progress-aware. It ensures that:
  \item The fuzzing process remains deterministic and resumable due to the structured tree format.
 \end{itemize}

-\subsubsection*{Error Handling and Retry Logic}
+\paragraph{Error Handling and Retry Logic.}

 Errors during execution are logged and associated with the current mutation node. If a function fails (e.g., due to protocol state loss or card reset), the runner resets the PC/SC link and the card, then resumes execution. This ensures that failures do not corrupt the mutation tree and allows exploration to continue.

-\subsubsection*{Scenario Persistence and Reuse}
+\paragraph{Scenario Persistence and Reuse.}

 To preserve fuzzing results, the entire mutation tree is serialized and stored using Python's \texttt{pickle} module in a \texttt{.resim} file. This enables post-analysis, comparison across card models, and reproducibility for future \gls{euicc} versions.

-\subsubsection*{Differential Testing}
+\paragraph{Differential Testing.}

 After multiple cards are fuzzed with the same scenario, their corresponding mutation trees are compared to identify behavioral discrepancies. This is done via depth-first traversal of the trees:

@@ -468,11 +472,16 @@ This differential testing method highlights edge-case inconsistencies across \gl
 % on the other hand an undefined error is still handled be the euicc but could not be properly handled -> could mean that there is a potential bug in the implementation and we need to do some further investigation into to this particular function call
 % -> euicc exceptions are ignored unless they are an UndefinedError

+% for each failure, hypothesis automatically saves the failing test to a local file
+% saves input and prints hash to identify the failed test input
+% failed runs are automatically tested again on future runs before generating new test cases
+% this allows us to test failed input against other cards when running the fuzzing against them -> differential testing
+
 While APDU-level fuzzing (see \cref{subsec:apdu_fuzzing}) is useful for evaluating command behavior across different \textit{euicc} implementations, it suffers from the drawback that random mutations, particularly at the bit or byte level, often invalidate the structured \gls{asn1} encoding. As a result, many \gls{apdu} mutations are immediately rejected as malformed, limiting the coverage and effectiveness of the test campaign.

 To address this limitation, we introduce a complementary \textit{data fuzzing} approach based on Design 3 in \cref{subsec:design_3}, that operates at the semantic level by fuzzing the input arguments of high-level \gls{lpa} function calls. This enables us to maintain structural validity while still exercising a wide variety of edge cases in the data provided to the \gls{euicc}. Our implementation builds on property-based testing frameworks designed for Python, in particular the \texttt{hypothesis} library~\cite{maciver_hypothesis_2019}.

-\paragraph{Fuzzing with Hypothesis}
+\paragraph{Fuzzing with Hypothesis.}
 Hypothesis is a property-based testing framework, which allows developers to define \textit{strategies} for input data. The framework then generates test cases based on these strategies and attempts to explore edge cases through randomized sampling and shrinking. Unlike traditional random fuzzing, Hypothesis ensures that generated inputs conform to the structural invariants defined by the strategy, thereby increasing the likelihood of discovering subtle logic errors in protocol handling.

 Hypothesis integrates seamlessly with \texttt{pytest} and uses the \texttt{@given} decorator to specify input generation strategies. For example, given the \gls{asn1} structure defined in the SGP.22 specification for the \texttt{Get\-Profile\-Info} function:
@@ -513,7 +522,7 @@ def test_get_profiles(self, use_iccid, profile_class, tags):

 This approach preserves the semantics and structure of the expected \gls{asn1} types while still allowing a wide variety of edge cases to be exercised.

-\paragraph{Implementation Scope}
+\paragraph{Implementation Scope.}
 Due to reliance on external infrastructure for the \gls{rsp} process, such as the \gls{smdpp} server, our fuzzing campaign focuses exclusively on the \gls{euicc}-side of the \gls{rsp} protocol. Invalid structured fuzzing requests directed at the \gls{smdpp} would lead to excessive traffic and could be misinterpreted as \gls{dos} attempts. Therefore, we restrict our tests to those functions defined in the ES10a, ES10b, and ES10c interfaces of the SGP.22 specification, which form the communication layer between the \gls{lpa} and the \gls{euicc}, specifically focusing on functions that accept structured input arguments and directly interact with the \gls{euicc}.


@@ -542,19 +551,21 @@ Specifically, we implemented fuzzing tests for the following functions:
    \end{itemize}
 \end{itemize}

-\paragraph{Fuzzing Lifecycle}
+\paragraph{Fuzzing Lifecycle.}
 During the \texttt{setUpClass} phase, a PC/SC link is initialized, and the \gls{euicc} is prepared (\eg, by installing a test profile) to ensure the preconditions for each function are met. After executing the class's test suite, the \texttt{eUICCMemoryReset} function is called with all reset options enabled to restore a clean state. All leftover notifications are processed to leave the card in a consistent state for subsequent tests.

-\paragraph{Error Classification}
+\paragraph{Error Classification.}
 According to the SGP.22 specification, many functions may return a generic \texttt{UndefinedError} in response to unexpected or malformed input. In our implementation, exceptions raised by the \gls{euicc} that map to well-defined error codes (i.e., subclasses of \texttt{EuiccException}) are not treated as test failures. These represent handled errors indicating that the input was invalid but the card responded appropriately.

 By contrast, when an \texttt{UndefinedError} is returned, we treat this as a potential indicator of an unhandled internal error or inconsistent implementation behavior. These cases are flagged for further investigation. Additionally, exceptions occurring outside the \gls{euicc}, such as Python \texttt{AssertionError}s or test harness failures, are treated as bugs in the testing infrastructure and are logged separately.
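The classification rule above can be sketched as follows. The class hierarchy is an assumption made for illustration: \texttt{UndefinedError} is modelled here as a subclass of \texttt{EuiccException} and must therefore be caught first, while \texttt{IccidNotFound}, \texttt{Verdict}, and \texttt{classify} are hypothetical names.

```python
from enum import Enum
from typing import Callable


class EuiccException(Exception):
    """Base class for errors reported by the eUICC (assumed hierarchy)."""


class UndefinedError(EuiccException):
    """Generic error code returned for unexpected input."""


class IccidNotFound(EuiccException):
    """Hypothetical example of a well-defined error code."""


class Verdict(Enum):
    PASSED = "passed"
    HANDLED = "handled"          # well-defined eUICC error, not a failure
    SUSPICIOUS = "suspicious"    # UndefinedError, flag for investigation
    HARNESS_BUG = "harness_bug"  # error outside the eUICC


def classify(call: Callable[[], object]) -> Verdict:
    """Run one fuzzed call and map its outcome to a verdict."""
    try:
        call()
    except UndefinedError:
        return Verdict.SUSPICIOUS
    except EuiccException:
        return Verdict.HANDLED
    except Exception:
        return Verdict.HARNESS_BUG
    return Verdict.PASSED
```

Catching \texttt{UndefinedError} before the base class keeps the two cases apart even if both share a common ancestor; anything that is not an eUICC exception, such as an \texttt{AssertionError}, falls through to the harness category.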

 \todo{Explain how we use differential testing in this context}

-\paragraph{Conclusion}
+\paragraph{Conclusion.}
 By combining property-based data generation with structural knowledge of \gls{asn1} types, we extend the fuzzing coverage of the \gls{euicc} interface beyond what is possible with \gls{apdu} mutation alone. This enables the discovery of semantic inconsistencies and unhandled corner cases in \gls{euicc} implementations, especially when compared across different vendors during differential testing as shown in \cref{sec:data_fuzzing_evaluation}.

+Hypothesis automatically records any failing test cases to local storage. For each failure, the corresponding input is saved and a unique hash is printed to allow reproducible identification of the triggering input. These previously failing test cases are automatically re-executed during future fuzzing runs prior to generating new test data. This mechanism enables us to efficiently validate whether the same input leads to diverging behavior across different \glspl{euicc}, thereby supporting systematic and automated differential testing.
+

 \section{CLI}
 \label{sec:cli}
@@ -587,8 +598,8 @@ The \gls{cli} is built using Python’s standard \texttt{argparse} module for ar

 The CLI structure is further detailed in \cref{sec:cli_structure}.

-\paragraph{Integration with Pytest}
+\paragraph{Integration with Pytest.}
 The data fuzzing component internally wraps \texttt{pytest}, leveraging the structure of Python test classes defined with the Hypothesis framework (cf. Section~\ref{subsec:data_fuzzing}). Each test class corresponds to a group of \gls{rsp} commands. By invoking the data fuzzing \gls{cli}, all available test classes are executed against the connected \gls{euicc}, with proper initialization and teardown logic handled automatically.

-\paragraph{Extensibility}
+\paragraph{Extensibility.}
 The \gls{cli} is designed with extensibility as a primary concern. Adding new commands requires minimal effort: developers only need to create a new subfolder, define a \texttt{run()} function, and register the new command in the main \gls{cli} dispatcher. Moreover, the \gls{cli} is completely decoupled from the core library logic, ensuring that library users are not forced to depend on the \gls{cli} subsystem and vice versa.
--- a/Graphics/reSIMualte_class_basic.svg
+++ b/Graphics/reSIMualte_class_basic.svg
--- a/Graphics/reSIMulate_class_slim.svg
+++ b/Graphics/reSIMulate_class_slim.svg
--- a/Graphics/reSIMulate_class_tracer.svg
+++ b/Graphics/reSIMulate_class_tracer.svg
--- a/Graphics/reSIMulate_setup.svg
+++ b/Graphics/reSIMulate_setup.svg
--- a/Graphics/reSIMulate_setup_fuzzing.svg
+++ b/Graphics/reSIMulate_setup_fuzzing.svg