
[Reinforcement Learning] The Sarsa(Lambda) Algorithm

FPGA硅农 · Published: 2021-12-15 18:54:12 · Views: 3

Algorithm Description
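
As a rough illustration of what Sarsa(λ) does, here is a minimal sketch of a single tabular update with accumulating eligibility traces. This is the textbook form of the rule, not necessarily the exact variant the author uses. The function name `sarsa_lambda_update`, the plain NumPy arrays `q` and `e`, and the trace-decay value `lam=0.9` are assumptions made for this example. `alpha` and `gamma` default to the values of `ALPHA` and `GAMMA` in the code of this post.

```python
import numpy as np

def sarsa_lambda_update(q, e, s, a, r, s_next, a_next, done,
                        alpha=0.8, gamma=0.9, lam=0.9):
    """One tabular Sarsa(lambda) step; q and e are updated in place.

    q, e   : float arrays of shape (n_states, n_actions)
    lam    : trace-decay parameter (hypothetical value, not from the post)
    """
    # TD target: bootstrap from the next state-action pair unless the episode ended
    target = r if done else r + gamma * q[s_next, a_next]
    delta = target - q[s, a]      # TD error
    e[s, a] += 1.0                # accumulating trace for the pair just visited
    q += alpha * delta * e        # every pair moves in proportion to its trace
    e *= gamma * lam              # decay all traces
    return delta

# Usage: two consecutive steps on a tiny 3-state, 2-action problem
q = np.zeros((3, 2))
e = np.zeros((3, 2))
sarsa_lambda_update(q, e, s=0, a=1, r=1.0, s_next=1, a_next=0, done=False)
sarsa_lambda_update(q, e, s=1, a=0, r=1.0, s_next=2, a_next=0, done=True)
```

The second step also raises `q[0, 1]`, because the trace left by the first step has not fully decayed. This propagation of credit to earlier state-action pairs is what distinguishes Sarsa(λ) from one-step Sarsa.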

Code Implementation

import numpy as np
import pandas as pd
import time

N_STATES = 25   # number of states in the 2-dimensional grid world
ACTIONS = ['left', 'right', 'up', 'down']     # available actions
EPSILON = 0.7   # probability of taking the greedy action (epsilon-greedy policy)
ALPHA = 0.8     # learning rate
GAMMA = 0.9     # discount factor
MAX_EPISODES = 1000   # maximum number of episodes
FRESH_TIME = 0.00001    # refresh time for one move

def build_q_table(n_states, actions):
    table = pd.DataFrame(
        np.zeros((n_states, len(actions))),     # q_table initial values
        columns=actions,    # action names
    )
    return table

# eligibility-trace table: same shape as the Q-table, initialised to zero
def build_e_table(n_states, actions):
    table = pd.DataFrame(
        np.zeros((n_states, len(actions))),
        columns=actions,
    )
    return table

def choose_action(state, q_table):
    state_actions = q_table.iloc[state, :]
    if (np.random.uniform() > EPSILON) or ((state_actions == 0).all()):  # explore: act randomly, or no action in this state has a value yet
        if state==0:
            action_name=np.random.choice(['right','down'])
        elif state>0 and state ...
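
The listing breaks off inside `choose_action`, in the part that limits random exploration to moves that stay on the grid (state 0, for example, may only go `'right'` or `'down'`). The following is a minimal sketch of that idea, not a reconstruction of the author's code. It assumes the 25 states form a 5 x 5 grid numbered row by row from the top-left corner. The names `GRID_W`, `valid_actions` and `choose_action_sketch` are hypothetical.

```python
import numpy as np

GRID_W = 5  # assumption: 25 states laid out row-major on a 5 x 5 grid

def valid_actions(state, width=GRID_W, height=GRID_W):
    """Return the moves that keep the agent on the grid."""
    row, col = divmod(state, width)
    actions = []
    if col > 0:
        actions.append('left')
    if col < width - 1:
        actions.append('right')
    if row > 0:
        actions.append('up')
    if row < height - 1:
        actions.append('down')
    return actions

def choose_action_sketch(state, q_row, epsilon=0.7, rng=None):
    """Epsilon-greedy choice among valid moves.

    q_row maps action name -> Q value for this state
    (a dict, or a pandas Series such as q_table.iloc[state, :]).
    """
    if rng is None:
        rng = np.random.default_rng()
    allowed = valid_actions(state)
    values = np.array([q_row[a] for a in allowed], dtype=float)
    # Explore with probability 1 - epsilon, or when nothing has been learned yet
    if rng.uniform() > epsilon or (values == 0).all():
        return str(rng.choice(allowed))
    # Otherwise exploit: take the valid action with the largest Q value
    return allowed[int(values.argmax())]
```

Deriving the valid moves from the row and column avoids writing a separate branch for each corner and edge, at the cost of fixing the grid layout in one place.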


    
        
            
        
        
            
                
                