% !TeX root = ../Thesis.tex
%************************************************
\chapter{Design}\label{ch:design}
%************************************************
\glsresetall % Resets all acronyms to not used
This chapter introduces the core design of our differential testing framework \sysname, which supports systematic analysis of multiple commercial and test \glspl{euicc}. The goal is to provide a flexible and extensible platform capable of:

\begin{itemize}
\item Reproducing and replaying real-world interactions between \glspl{euicc} and \glspl{lpa},
\item Mutating protocol-level inputs to explore edge cases and verify robustness,
\item Comparing card responses under similar inputs to identify differences.
\end{itemize}

To achieve these goals, we propose a modular three-layered architecture:

\begin{enumerate}
\item \textbf{Tracing and Replay}
\item \textbf{APDU Fuzzing}
\item \textbf{Data Fuzzing}
\end{enumerate}
\section{Design 1: Tracing and Replay}
\label{subsec:design_1}

The first design focuses on capturing and replaying real interaction sequences between an \gls{lpa} and a target \gls{euicc}. This allows deterministic replay of recorded \gls{apdu} sequences on different cards for side-by-side comparison.

\paragraph{Design Rationale.} Real-world traces provide insights into how \glspl{euicc} are used in practice, including undocumented behavior not covered by specifications. Replaying identical \gls{apdu} sequences across cards enables direct differential testing. To ensure realistic conditions, the setup is designed to remain as close as possible to the original communication path between \gls{lpa} and \gls{euicc}.
\paragraph{Key Components.}
\begin{itemize}
\item \textbf{Passive Tracing:} A tracing module passively intercepts \gls{apdu} exchanges over a physical interface using \texttt{simtrace2}. Commands are partially classified and tagged with functional metadata, such as the selection of the \gls{isdr}.
\item \textbf{Structured Recording:} Each session is recorded along with metadata, including command classifications, source and target \glspl{aid}, and session context.
\item \textbf{Replay Engine:} Captured traces are injected into a session between an \gls{lpa} running on a smartphone and the \gls{euicc}. The engine adjusts session-specific fields (\eg, \glspl{aid}) and flags diverging behavior in the response status words or payloads, as sketched below.
\end{itemize}
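
To make the replay step concrete, the following Python sketch outlines the core loop under simplifying assumptions: \texttt{RecordedApdu}, \texttt{rewrite\_aids}, and \texttt{card.transmit} are illustrative names, not the framework's actual interfaces.

\begin{verbatim}
from dataclasses import dataclass

@dataclass
class RecordedApdu:
    """One traced command/response pair plus classification metadata."""
    command: bytes   # raw C-APDU as captured via simtrace2
    response: bytes  # raw R-APDU, including the status word
    label: str       # partial classification, e.g. "SELECT ISD-R"

def rewrite_aids(apdu: bytes, aid_map: dict) -> bytes:
    """Substitute recorded session-specific AIDs with the target card's."""
    for old, new in aid_map.items():
        apdu = apdu.replace(old, new)
    return apdu

def replay(trace: list, card, aid_map: dict) -> list:
    """Replay a recorded trace on another card and flag divergences.

    `card.transmit` stands in for whatever transport sends a raw APDU
    and returns the raw response."""
    divergences = []
    for entry in trace:
        response = card.transmit(rewrite_aids(entry.command, aid_map))
        if response != entry.response:  # status word or payload differs
            divergences.append((entry.label, entry.response, response))
    return divergences
\end{verbatim}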
\paragraph{Motivation.} This design provides a realistic baseline for comparison and reproducibility without requiring full specification access or assuming protocol compliance. It enables empirical analysis of protocol behavior under operational conditions.
\section{Design 2: APDU Fuzzing}
\label{subsec:design_2}

The second design focuses on exploring the input space of the \gls{euicc} \gls{rsp} protocol stack by mutating valid \glspl{apdu}. The aim is to test robustness against malformed, unexpected, or edge-case inputs and to expose implementation-level inconsistencies.

\paragraph{Design Rationale.} While real traces offer insight into typical usage, they often fail to reveal vulnerabilities related to invalid inputs. \gls{apdu} fuzzing is essential for testing the correctness of error handling and boundary enforcement.
\paragraph{Key Components.}
\begin{itemize}
\item \textbf{Scenario-Based Execution:} Scenarios are high-level sequences of \gls{euicc} operations (\eg, a profile download) that anchor the fuzzing process.
\item \textbf{Mutation Engine:} Valid \glspl{apdu} are mutated using deterministic and randomized strategies common to industry-standard fuzzers, including bit-flipping, truncation, data zeroing, byte replacement, and block shuffling; a sketch of these strategies follows this list.
\item \textbf{Mutation Tree Representation:} The fuzzer constructs a hierarchical tree representing each function call, input mutation, and observed result, supporting exhaustive and resumable test runs.
\item \textbf{Exception-Aware Runner:} Each test is isolated, and card resets are used to restore a clean state, preventing a single failure from corrupting the session.
\item \textbf{Comparison Engine:} Results from multiple \glspl{euicc} are compared node-by-node. Deviations in status words, exceptions, or data are reported and visualized to highlight divergent execution paths.
\end{itemize}
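
As a concrete illustration of the mutation strategies listed above, the following Python sketch gives minimal versions of them. The function names and the fixed block size are illustrative assumptions; seeding the random generator is what makes the randomized strategies reproducible across runs.

\begin{verbatim}
import random

def bit_flip(data: bytes, index: int, bit: int) -> bytes:
    """Deterministically flip a single bit of the APDU body."""
    out = bytearray(data)
    out[index] ^= 1 << bit
    return bytes(out)

def zero_block(data: bytes, start: int, length: int) -> bytes:
    """Overwrite a block of the payload with zeros."""
    out = bytearray(data)
    end = min(start + length, len(out))
    out[start:end] = bytes(end - start)
    return bytes(out)

def truncate(data: bytes, length: int) -> bytes:
    """Drop the tail of the payload."""
    return data[:length]

def replace_byte(data: bytes, rng: random.Random) -> bytes:
    """Replace one byte at a random position (seeded, hence replayable)."""
    out = bytearray(data)
    out[rng.randrange(len(out))] = rng.randrange(256)
    return bytes(out)

def shuffle_blocks(data: bytes, block: int, rng: random.Random) -> bytes:
    """Split the payload into fixed-size blocks and shuffle their order."""
    chunks = [data[i:i + block] for i in range(0, len(data), block)]
    rng.shuffle(chunks)
    return b"".join(chunks)
\end{verbatim}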
\paragraph{Motivation.} \gls{apdu} fuzzing allows systematic probing of error-handling logic, \gls{asn1} decoding boundaries, and specification ambiguities. The use of deterministic strategies supports reproducibility and enables direct comparison across different cards.
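
The mutation tree and the node-by-node comparison described above can be sketched as follows; \texttt{MutationNode} and \texttt{compare} are illustrative simplifications of the bookkeeping, not the framework's actual data model.

\begin{verbatim}
from dataclasses import dataclass, field

@dataclass
class MutationNode:
    """One node of the mutation tree: a function call with one mutated
    input and the responses observed for it on each card."""
    function: str      # high-level command name (illustrative)
    mutation: str      # strategy identifier, e.g. "bitflip:12:3"
    results: dict = field(default_factory=dict)   # card id -> response
    children: list = field(default_factory=list)  # subsequent calls

def compare(node, reference, deviations, path=()):
    """Walk the tree and record cards whose response deviates from
    the reference card at the same node."""
    expected = node.results.get(reference)
    here = path + (node.function,)
    for card, response in node.results.items():
        if card != reference and response != expected:
            deviations.append((here, card, response))
    for child in node.children:
        compare(child, reference, deviations, here)
\end{verbatim}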
\section{Design 3: Data Fuzzing}
\label{subsec:design_3}

The third design targets application-level logic using structurally valid inputs. It leverages property-based testing to exercise schema-conformant payloads and detect semantic inconsistencies or robustness issues.

\paragraph{Design Rationale.} Data fuzzing explores the validity boundaries of specific protocol fields. Unlike raw \gls{apdu} mutation, it focuses on high-level, syntactically valid but semantically unusual inputs to stress the logic of the \gls{lpa}-\gls{euicc} interaction.
\paragraph{Key Components.}
\begin{itemize}
\item \textbf{Type-Aware Input Generation:} Payloads are generated according to type definitions and field constraints, ensuring compliance with expected formats.
\item \textbf{Property-Based Fuzzing:} A wide range of structurally valid inputs is generated to test \gls{lpa} application-layer endpoints systematically (see the sketch following this list).
\item \textbf{No Oracle Required:} Rather than expecting specific output, the tests flag crashes, exceptions, or malformed responses as anomalies.
\item \textbf{Replayable Tests:} Fuzzing inputs are recorded and replayable across multiple cards, enabling differential analysis and regression testing.
\end{itemize}
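
As an example of this style of testing, the sketch below uses the Python \texttt{hypothesis} library to drive a hypothetical \texttt{delete\_profile} binding with schema-conformant ICCIDs. The stub, the strategy bounds, and the accepted status words are assumptions for illustration; in the framework, the call would be forwarded to the card under test.

\begin{verbatim}
from dataclasses import dataclass
from hypothesis import given, settings, strategies as st

@dataclass
class Response:
    status_word: int
    payload: bytes

def delete_profile(iccid: str) -> Response:
    """Stub: a real binding would issue the corresponding command
    to the card and return its raw response."""
    return Response(status_word=0x9000, payload=b"")

# ICCIDs are decimal strings of up to 20 digits; the strategy stays
# within that schema while still exploring unusual values.
iccids = st.text(alphabet="0123456789", min_size=19, max_size=20)

@given(iccids)
@settings(max_examples=200, deadline=None)
def test_delete_profile_accepts_valid_iccids(iccid):
    # No behavioral oracle: only crashes, exceptions, or responses
    # outside the expected status-word space count as anomalies.
    response = delete_profile(iccid)
    assert response.status_word in {0x9000, 0x6A88}
\end{verbatim}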
\paragraph{Motivation.} This design complements trace and \gls{apdu}-level fuzzing by shifting focus to semantic and structural correctness. It enables in-depth testing of parser robustness and adherence to data schemas within application-level interfaces.
\section{Design Comparison}
\begin{table}[t]
\centering
\caption{Comparison of Design Strategies}
\label{tab:design-strategies}
\begin{tabular}{|l p{.25\textwidth} p{.25\textwidth} p{.25\textwidth}|}
\hline
\textbf{Design} & \textbf{Goal} & \textbf{Mutation Type} & \textbf{Input Validity} \\
\hline
Design 1 & Behavioral Comparison & None (Replay only) & Fully valid \\
\hline
Design 2 & Protocol Robustness Testing & Byte-level mutations & Valid base, mutated \\
\hline
Design 3 & Semantic Boundary Exploration & Schema-level generation & Structurally valid \\
\hline
\end{tabular}
\end{table}
Each of the three design strategies presented in this chapter targets a different dimension of the fuzzing and differential testing problem, offering complementary strengths and tradeoffs, as summarized in \cref{tab:design-strategies}.

\textbf{Tracing and Replay} focuses on the deterministic reproduction of real-world \gls{lpa}-\gls{euicc} sessions. By replaying fully valid \gls{apdu} sequences captured from live devices, this strategy ensures strict behavioral equivalence and reproducibility. However, it is limited in its ability to explore malformed or edge-case inputs.

\textbf{APDU-Level Fuzzing} extends this foundation by introducing structured mutations into valid \glspl{apdu}. It strikes a balance between input validity and exploratory depth, allowing the framework to probe robustness, error-handling routines, and implementation-specific divergences while still supporting comparative analysis across multiple \glspl{euicc}.

\textbf{Structured Data Fuzzing}, finally, operates at the semantic layer by generating well-formed but edge-case-rich inputs for application-level interfaces. This approach excels at uncovering logic flaws and inconsistencies in the parsing and interpretation of complex data structures, particularly those encoded in \gls{asn1}.

Combined, these designs form a comprehensive and modular fuzzing framework capable of both functional and robustness testing of commercial \gls{esim} and eSIM-on-SIM implementations. Their integration enables wide coverage of the input space, from valid production-level traffic to syntactically and semantically malformed payloads, thereby supporting rigorous security and conformance evaluations.