This article gives a detailed description of the theories & calculation method used to generate ray propagation inside Speos Surface State Plugin based on a given possibility distribution function. This article also gives a brief description Speos Surface State Plugin and how to create functions propagating rays.
Possibility Distribution Function (PDF) and Inverse Probability Integral Transform
In probability theory and statistics, a probability distribution is the mathematical function that gives the probabilities of occurrence of different possible outcomes for an experiment. It is a mathematical description of a random phenomenon in terms of its sample space and the probabilities of events.
In raytracing, the bidirectional scattering distribution function (BSDF) radiometrically characterizes the scatter of optical radiation from a surface as a function of the angular positions of the incident and scattered beams. Due to the natural characteristics of a material rough surface, ray propagation (e.g. reflection direction) can understood as a random phenomenon and satisfying the distribution described by the BSDF.
Inverse probability integral transform (also known as inversion sampling, the inverse transform sampling, the inverse transformation method) is a basic method for pseudo-random number sampling, i.e., for generating sample numbers at random from any probability distribution. Mathematically, it means a method for generating random numbers from any probability distribution by using its inverse cumulative distribution \( F^{-1} (x) \). The cumulative distribution for a random variable is \( F_x(x)=P(X<x) \) .
PDF and Cumulative Distribution Function (CDF)
The behaviour of using cumulative distribution function can be illustrated with the following GIF. As presented inside the GIF animation, with the increasing sampling points along the vertical axis based on CDF (i.e. the plot in the centre), the distribution function on the horizontal axis is achieved.
using the Inverse probability integral transform, one and only one value at the horizontal axis can be found with a random value generated at the vertical axis. The inverse probability integral transform method can be simply achieved using linear interpolation as demonstrated by the image below:
The image above shows the interpolation calculation using CDF. With a given cdf value (vertical axis) generated in the range between [0, 1], a unique x value corresponding to such cdf value can be found in the horizontal axis. The calculation is done by finding the upper and lower bound of the cdf points, i.e. green and blue points in the example image above. Based on the interpolation position of random CDF between the green and blue points, the target x value is calculated based on the same interpolation value between green and blue points on axis.
Example: 1 Distribution Function PDF and 1D CDF
In the following simplified demonstration example, we assume that our computer can, on demand, generate independent realizations of a random variable with possibility distribution . At each random variable, there is a corresponding possibility value between [0,1], i.e. 0 as 0% while 1 as 100%.
Using random normal distribution function from python numpy module, the python script generates a number array which a normal distribution. The script is :
import numpy as np
val_min = 0
val_max = 1
variation = (val_max - val_min)/2
std_dev = variation/3
mean = (val_max + val_min)/2
dist_normal = np.random.normal(mean, std_dev, 1000)
the corresponding histogram plot for the value generated can be plot as below:
As expected, a possibility distribution is a normalized histogram plot. Thus, the cumulative distribution can be draw like below
As such, any possibility value between [0, 1], i.e. values at the vertical axis, has a corresponding value at its domain, i.e. values at the horizontal axis.
Then, based on the Inverse Probability Integral Transform method using CDF, we can run 1000 sample points with the given CDF function using the python script below.
import bisect
import os
CDF = [0.004, 0.014, 0.019, 0.033, 0.067, 0.123, 0.19, 0.288, 0.393, 0.524,
0.665, 0.782, 0.868, 0.923, 0.961, 0.981, 0.992, 0.997, 0.999, 1]
x_domain = [-0.02, 0.03, 0.08, 0.13, 0.18, 0.23, 0.28, 0.33, 0.38, 0.43,
0.48, 0.53, 0.58, 0.63, 0.68, 0.73, 0.78, 0.83, 0.88, 0.93]
def inverse_tranform(cdf_val):
cdf_id = bisect.bisect_right(CDF, cdf_val)
if cdf_id == 0:
# val, CDF[0], CDF[1]
# x, x_domain[0], x_domain[1]
interpolation_coef = (cdf_val - CDF[0]) / (cdf_val - CDF[1])
return (x_domain[0] - interpolation_coef * x_domain[1]) / (1 - interpolation_coef)
else:
# CDF[0], val, CDF[1]
# x_domain[0], x, x_domain[1]
interpolation_coef = (CDF[cdf_id - 1] - cdf_val) / (CDF[cdf_id - 1] - CDF[cdf_id])
return x_domain[cdf_id - 1] - interpolation_coef * (x_domain[cdf_id - 1] - x_domain[cdf_id])
def demo():
total_samples = 1000
for i in range(total_samples):
cdf_value = int.from_bytes(os.urandom(8), byteorder="big") / ((1 << 64) - 1)
x_value = inverse_tranform(cdf_value)
print(x_value)
demo()
The script above will produce 1000 outputs representing the x-axis values. You can copy those values into Excel sheet and draw a histogram plot. The result of inverse probability integral transform with sampling value of 1000 can be seen as below on the left. The difference between input and out is very small with only 1000 samplings. The difference (initial result shown on the right) will be even smaller at a much higher value of samplings.
Example: 2D Distribution Function PDF and 2D CDF
Overall, 2D case is the same with the 1D example, but working on a tabular data rather than a number array. Before moving a complex example, a 3x3 pixel example is used below to explain the calculation process.
Each pixel above has a position index representing its position in the space. Each pixel has its value represented in different value. In this simplified example, the pixel in the centre has the largest value using dark blue. The rest pixels have lower value using light blue.
Following the process to get cumulative function, horizontal cumulation is conducted firstly. As illustrated in the image below, colour on those pixels which are closer the right side become darker due to cumulation. At this stage, the value of each pixel equals to the sum of pixel on its left hand side. As a result, each horizontal line of pixels presents one CDF. With such CDF, it is possible to generate a PDF for each line of pixels.
Next, vertical cumulation is conducted. At this stage, only the pixel at the right edge is required, which produces a new CDF for vertical PDF. With such CDF, a vertical position can be generated, i.e. at line 1 or 2 or 3. Once the vertical position is calculated, using the corresponding CDF from the previous step will generate a horizontal position.
In the following example, we use a table of data with gaussian distribution defined with following parameters: Gaussian size in X as 5, Gaussian size in y as 5, height of gaussian as 0.5.
Running the script below, a 60 by 60 data table can be seen.
import numpy as np
import matplotlib.pyplot as plt
def gaussian(xy, x0, y0, sigma_x, sigma_y, H):
"""
2D gaussian function
Parameters
----------
xy : tuple of floats
tuple of x,y
x0 : float
x offset
y0 : float
y offset
sigma_x : float
Gaussian size in X
sigma_y : float
Gaussian size in y
H : float
height of gaussian
Returns
-------
float
value of gausian function at x,y with given sigma's and H
"""
x, y = xy
ax = 1 / (2 * sigma_x**2)
ay = 1 / (2 * sigma_y**2)
gaussian_value = H * np.exp(-ax * ((x - x0) ** 2) - ay * ((y - y0) ** 2))
return gaussian_value
x = []
y = []
z = []
for i in range(60):
for j in range(60):
x.append(j)
y.append(i)
z.append(gaussian((j, i), 30, 30, 5, 5, 0.5))
total_val = sum(z)
z = [item/total_val for item in z]
plt.scatter(x, y, c=z)
plt.colorbar()
plt.show()
z = np.array(z)
z = z.reshape((60,60))
for i in range(len(z)):
for j in range(len(z[i])):
if j!=0:
z[i][j]+=z[i][j-1]
plt.scatter(x, y, c=z)
plt.colorbar()
plt.show()
The image above has been normalized. The colour of each pixel represents the possibility value at its position.
As explained above, a 2D CDF is required before applying the inverse transfer sampling. In result below example, horizontal cumulation along the x axis is used. As a result, for each horizontal line we have one CDF as show in the example of 1D.
The next step is to calculate the vertical CDF. The last element of each horizontal CDF stands the sum of all the data point at that horizontal line. This can be displayed by the image below showing as a PDF:
After conducting vertical cumulation, a vertical CDF is achieved like above. To achieve the plots above, the following python script is used:
horizontal_CDF_last = z[:, -1]
vertical_axis = z[:60]
print(len(horizontal_CDF_last), len(vertical_axis))
vertical_CDF = []
for i in range(60):
if i != 0:
vertical_CDF.append(vertical_CDF[i-1] + horizontal_CDF_last[i])
else:
vertical_CDF.append(horizontal_CDF_last[i])
plt.plot(horizontal_CDF_last)
plt.show()
plt.plot(vertical_CDF)
plt.show()
Using the last CDF, we can find corresponding horizontal data set used at y-axis. Then, use the founded horizontal CDF data array, the x-axis data can be found.
Run the following python script, a re-constructed plot of 2D image using 60000 sampling points can be achieved:
def inverse_tranform(cdf_val, x_domain, CDF):
cdf_id = bisect.bisect_right(CDF, cdf_val)
if cdf_id == 0:
# val, CDF[0], CDF[1]
# x, x_domain[0], x_domain[1]
interpolation_coef = (cdf_val - CDF[0]) / (cdf_val - CDF[1])
return (x_domain[0] - interpolation_coef * x_domain[1]) / (1 - interpolation_coef)
else:
# CDF[0], val, CDF[1]
# x_domain[0], x, x_domain[1]
interpolation_coef = (CDF[cdf_id - 1] - cdf_val) / (CDF[cdf_id - 1] - CDF[cdf_id])
return x_domain[cdf_id - 1] - interpolation_coef * (x_domain[cdf_id - 1] - x_domain[cdf_id])
def Demo2():
sampling = 60000
z_values = np.zeros((60,60))
print(z_values)
y_position = []
x_position = []
for i in range(sampling):
df_value = int.from_bytes(os.urandom(8), byteorder="big") / ((1 << 64) - 1)
y_position.append(inverse_tranform(df_value, x, vertical_CDF))
df_value = (int.from_bytes(os.urandom(8), byteorder="big") / ((1 << 64) - 1)) * z[int(y_position[-1])][-1]
x_position.append(inverse_tranform(df_value, x, z[int(y_position[-1])]))
z_values[int(x_position[-1])][int(y_position[-1])] += 1
z_values.reshape(3600, )
plt.scatter(x, y, c=z_values, alpha=1)
plt.xlim([0, 60])
plt.ylim([0, 60])
plt.show()
Demo2()
Compared with initial plot on the right, the two results are nearly identical:
Implementation in Speos Surface State Plugin
The speos surface state plugin is a Speos surface property file which defines the surface response at a given incident ray. The plugin will be called during simulations, each time a ray is hitting a surface(s) that has plugin optical properties. The plugin will then use inputs from Speos to compute the ray updated info. The communication between speos and surface state plugin can be understood with the image below:
Key Function in Speos Surface State Plugin to Propagate Rays
Interaction function is a core function inside the Speos Surface State plugin. It is the function called by Speos each time a ray hits the surface, to compute the ray that exits the surface. The user can update the direction, energy and polarization of the ray. This function must return an int value, based on the output state of the ray: -2 for diffuse reflection, -1 for specular reflection, 0 for absorption, 1 for specular transmission and 2 for diffuse transmission. Thus, inside this function the PDF and CDF is utilized to find a ray direction at a given incident ray. A comparison is made between Speos simulation and Speos Plugin Simulation. As you can see, the results are nearly identical to each other.
Implementation in Zemax Surface Scattering Distribution
Similar knowledge is also introduced in the Zemax article: How to use tabular BSDF data to define the surface scattering distribution.