Page 4

  • I found OneMax code on the Internet and modified it to run the
    Spark way.
  • Then I ran the modified demo on the Spark cluster.
  • But I have only run it successfully in local mode; local mode means
    running the job on just one machine. When I run it in cluster mode,
    it hits some connection timeout bugs, so I am still debugging and
    tuning the demo, and I will summarize the tuning and debugging
    experience.



Page 8

  • Next are the details of the WITF model.
  • The model uses cross-domain data to compute user vectors, domain
    vectors, and virtual item vectors.

  • During the process, some vectors can be computed in parallel. For
    example, each user’s vector can be updated in parallel, because it
    is conditionally independent of the other users. The domain vectors
    and the constrict vectors are in a similar situation to the user
    vectors: they can all be updated in parallel.
  • After getting those vectors, the model uses the common measurement
    RMSE to compute the accuracy, and that accuracy becomes the fitness.
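
As a small illustration of the RMSE measurement mentioned above, here is a minimal sketch in plain Python. The function name `rmse` and the list-based inputs are assumptions made for this example; the notes do not show how the WITF model stores its predictions.

```python
import math

def rmse(predicted, actual):
    """Root mean squared error between two equal-length lists of ratings."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    # Square each prediction error, average them, then take the square root.
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))
```

A lower RMSE means better predictions, so if the genetic algorithm maximizes fitness, the value would need to be inverted or negated first; the notes do not say which convention is used.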



Page 9

  • Combining WITF and the genetic algorithm on Spark using the
    two-phase parallelization runs into a problem, namely nested Spark
    RDDs.
  • The best situation is to treat each individual as a Spark RDD
    element and then execute each individual’s fitness evaluation in
    parallel on Spark, but the fitness function, the WITF model, also
    uses Spark RDDs inside.
  • So the RDDs would be nested, but Spark does not support nested
    RDDs. I have to find an alternative, which is to evaluate fitness
    sequentially, not in parallel. The efficiency depends on the speed
    of WITF on Spark and needs further consideration.
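
The alternative described above (a sequential outer loop over individuals, with the parallelism kept inside each fitness evaluation) can be sketched roughly as follows. This is not Spark code: a thread pool from Python's standard library stands in for the cluster, and `partial_squared_error` is a hypothetical toy model, not WITF. All names here are assumptions for illustration only.

```python
from concurrent.futures import ThreadPoolExecutor

def partial_squared_error(args):
    """Inner parallel task: squared error of one data chunk."""
    individual, chunk = args
    # Hypothetical stand-in for the WITF model: predict every rating
    # with the individual's single parameter.
    return sum((individual - rating) ** 2 for rating in chunk)

def evaluate_fitness(individual, chunks, pool):
    """One fitness evaluation; only the work inside it runs in parallel."""
    tasks = [(individual, chunk) for chunk in chunks]
    total = sum(pool.map(partial_squared_error, tasks))
    count = sum(len(chunk) for chunk in chunks)
    return (total / count) ** 0.5  # RMSE over all chunks

def evaluate_population(population, chunks, workers=4):
    """Outer loop is sequential: one individual at a time, no nesting."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return [evaluate_fitness(ind, chunks, pool) for ind in population]
```

With this shape, the total time is roughly the population size times the cost of one fitness evaluation, which is why the speed of a single WITF run matters so much.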



Page 1

  • Hello, everyone, this is Setsu. In this video, I will mainly talk
    about the architecture of my proposed method.



Page 3

  • At the beginning, I will briefly introduce the OneMax problem in
    genetic algorithms.
  • The final goal of the OneMax problem is to find the max individual,
    which is all ones, starting from some initial individuals that are
    made up of a series of 0s and 1s.
  • Let us see the whole process. First, there is an initial population
    with a number of individuals, and the fitness of each individual is
    the total number of 1s in the individual. After some genetic
    operations such as crossover, mutation, and selection, there is a
    new generation of the population; the genetic operations are then
    repeated until the max individual, which is all 1s, is found.
  • This simple example is often used to test the efficiency of a
    genetic algorithm, so I tried to run this genetic algorithm on
    Spark to test its efficiency and performance.
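
The OneMax process described above can be sketched in plain Python, without Spark. This is only a minimal single-machine illustration: the one-point crossover, bit-flip mutation, binary tournament selection, elitism, and all parameter values are assumptions chosen for the example, not the operators used in the demo.

```python
import random

def fitness(individual):
    """OneMax fitness: the number of 1s in the bit list."""
    return sum(individual)

def crossover(a, b, rng):
    """One-point crossover: head of parent a, tail of parent b."""
    point = rng.randint(1, len(a) - 1)
    return a[:point] + b[point:]

def mutate(individual, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [1 - bit if rng.random() < rate else bit for bit in individual]

def select(population, rng):
    """Binary tournament: the fitter of two randomly chosen individuals."""
    a, b = rng.sample(population, 2)
    return a if fitness(a) >= fitness(b) else b

def run_onemax(length=20, pop_size=30, generations=200, rate=0.02, seed=0):
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(length)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        best = max(population, key=fitness)
        if fitness(best) == length:  # the all-ones individual was found
            break
        # Keep the current best (elitism), then fill up with new children.
        children = [best]
        while len(children) < pop_size:
            child = crossover(select(population, rng),
                              select(population, rng), rng)
            children.append(mutate(child, rate, rng))
        population = children
    return max(population, key=fitness)
```

Calling `run_onemax()` returns the best individual found within the generation limit; because the fitness is just a sum of bits, it is cheap to evaluate, which is what makes OneMax a convenient test problem.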



Page 6

  • The parallel strategy of my proposed method follows a paper
    published in 2017. This paper uses a genetic algorithm on Spark
    to find optimal test cases.
  • The paper proposed a two-phase parallelization. It contains parallel
    fitness evaluation and parallel genetic operations during the whole
  • When doing parallel fitness evaluation, it computes each
    individual’s fitness value in parallel. When doing parallel genetic
    operations, it does each crossover, mutation, and selection in
    parallel.
  • Using this two-phase parallel strategy on Spark, it speeds up